2024

April 12, 2024
in Backend, Golang
1 min read

Upgrade Golang to 1.21

日常升级.

Changes

Major
- [x] go.mod
- [x] Dockerfile
Minor
- [x] Use builtin slices, slog
- [x] Use builtin context.WithoutCause()

Check

[x] Build all apps' image
[x] Call a RPC which used context.WithoutCause()

About 1.21

https://tonybai.com/2023/08/20/some-changes-in-go-1-21/

分章分成四个部分:

语言本身
编译器与运行时
工具链
标准库

其中比较重要的有: loopvar, GC 优化, slog, 以及兼容性部分

Kimi

总结文章, 给出大纲以及重点, 使用 Markdown 的格式编写:

Go 1.21 版本更新概述:

语言变化
- builtin 包新增预定义函数：min、max 和 clear
- 明确了包初始化顺序算法
- 增强泛型的类型推断能力
- 修正 Go1 中的两个“陷阱”：panic(nil)和 for 循环变量
Go 编译器与运行时变化
- PGO 默认开启
- 降低 GC 尾部延迟
- 支持 WASI
Go 工具链更新
- 向后兼容：扩展和规范化 GODEBUG 的使用
- 向前兼容：go.mod 文件中的 go line 作为约束规则，引入 toolchain 指示符和 GOTOOLCHAIN 环境变量
Go 标准库更新
- log/slog：结构化日志支持
- slices、maps 和 cmp：泛型包加入标准库
- 其他变化：errors.ErrUnsupported、flag.BoolFunc、 net 的多路径 TCP 支持、 reflect.ValueOf 优化、sync.OnceFunc 等
参考资料
- Go 1.21 Release Notes
- Go 1.21 版本发布博客
- 兼容性相关博客
- Godebug 手册
- LoopvarExperiment 页面
- 如何 Golang Evolves without Breaking Programs 文章
- PGO 用户指南

这个大纲总结了文章中关于 Go 1.21 版本的主要更新和变化，包括语言特性的变更、编译器与运行时的改进、工具链的更新以及标准库的新增和改动。同时，列出了文章末尾提供的参考资料链接，方便读者进一步阅读和了解。

January 15, 2024
in DevOps, Database
5 min read

Migrate PostgreSQL Databases Using pglogical

Introduction

To reduce the latency of database queries, we have decided to migrate our Postgres database from one zone to a zone closer to our application.

TLDR

To minimize application downtime, we have chosen continuous migration using pglogical.

We need to modify the database configuration, enable the pglogical extension, and restart the instance. If you have multiple databases to migrate, please check worker registration failed for configuring max_worker_processes correctly.

TOML
# Ref https://github.com/2ndQuadrant/pglogical#quick-setup

wal_level = 'logical'
max_worker_processes = 16   # one per database needed on provider node
                            # one per node needed on subscriber node
max_replication_slots = 16  # one per node needed on provider node
max_wal_senders = 16        # one per node needed on provider node
shared_preload_libraries = 'pglogical'

track_commit_timestamp = on # needed for last/first update wins conflict resolution
                            # property available in PostgreSQL 9.5+

You can use SHOW to check the config parameter:

SQL
SHOW wal_level;
SHOW shared_preload_libraries;

DDL

To start with pglogical, it's crucial to ensure that the database structures of both the provider and subscriber are identical. We employ our goose based migration for this purpose. However when creating a subscription using pglogical.create_subscription(), you can set the synchronize_structure parameter to true, but it requres that the owner of these structures also exists in the subscriber.

Create subscription

On the provider node:

SQL
-- Create a user for logical replication
CREATE ROLE logicalrep WITH LOGIN;
ALTER USER logicalrep WITH PASSWORD 'your_pass';


-- Here we use pglogical extentsion
create extension pglogical;


-- The dsn can be blank
SELECT pglogical.create_node(
    node_name := 'provider1',
    dsn := ''
);

-- Add all tables in public schema to the default replication set
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

-- Grant privileges to the logicalrep user
GRANT USAGE ON schema public TO logicalrep;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO logicalrep;
GRANT USAGE ON schema pglogical TO logicalrep;
GRANT SELECT ON ALL TABLES IN SCHEMA pglogical TO logicalrep;

On the subscriber node:

Here we sync the order database.

SQL
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
    node_name := 'sub',
    dsn := 'dbname=order'
);

-- Create subscription
SELECT pglogical.create_subscription(
    subscription_name := 'sub_order',
    provider_dsn := 'host=your_host port=5432 dbname=order user=logicalrep password=your_pass'
);

-- Check it
--
-- Expected status is `replicating`.
-- If not, check the database log for details (provider or subscriber node).
select * from pglogical.show_subscription_status();

Sync sequences

One common mistake people make is forgetting the last values of sequences, which can lead to duplicated key errors.

We write a script to do this. The new last value will be 1.1 times the origin, factoring in subscription latency. The primary SQL statement is:

SQL
-- Get last values
SELECT * FROM pg_sequences;

-- Apply them to subscriber
ALTER SEQUENCE public.videos_id_seq RESTART new_value;

-- To check the last value of altered sequences, you can use the following SQL.
-- Note that `SELECT * FROM pg_sequences` will show NULL last value after you alter it.
-- Refer to https://gist.github.com/lbbedendo/449ff46d3baa7838b99ec513c2de92a7
SELECT last_value FROM public.videos_id_seq;

Check rows

We've crafted a script to verify the equality of both the row count and a randomly selected subset of rows. See: https://gist.github.com/ionling/10f50bf3d77040fa8bb4f6695c23befe

Drop subsription

Once we confirm that the source and target databases are in a consistent state, we can proceed to drop the subscription.

SQL
SELECT * FROM pglogical.drop_subscription('sub_order');

Solutions

Here, we will explore several solutions for the migration process.

Dump and restore

Dumping the old database and restoring it to the new database is a viable approach. However, it comes with the drawback of causing a significant downtime, which is undesirable. During this downtime, tasks such as database restoration, data verification, and service restarts are necessary, consuming a considerable amount of time.

Streaming replication

PostgreSQL offers native support for streaming replication, facilitating continuous synchronization of data from a source database to a new one. This approach significantly reduces service downtime. By configuring replication in advance, the migration process only requires restarting services, making it more efficient and minimizing downtime.

Multi-master

The best solution is to combine the old and new databases in a multi-master cluster. This allows us to smoothly migrate any service that relies on the old database to the new without any downtime.

One implementation for achieving this is through the spock extension. However, the performance and potential issues of this extension might be a concern. For more insights, you can refer to the blog post: How to achieve multi-master replication in PostgreSQL with Spock

Summary

To strike a balance between complexity and functionality, we ultimately opted for the streaming replication solution.

AWS has a blog on database upgrading that is also useful for migrating:

Notes

Could not open relation with OID

When debugging the subscription, I mistakenly dropped the pglogical extension, deleted the pglogical schema, and subsequently recreated it. As a consequence, I encounter a error when running certain pg commands:

Text Only
1	`Could not open relation with OID`

To resolve it, just reconnect the db.

See https://github.com/2ndQuadrant/pglogical/issues/347

PostgreSQL priveledges

Some priveledges are not intuitive, like this:

SQL
GRANT ALL ON DATABASE mydb TO admin;

grants privileges on the database itself, not things within the database. admin can now drop the database, still without being able to create tables in schema public.^1

ERROR: worker registration failed

When creating a subscription, we encountered the following error:

Text Only
1	`ERROR: worker registration failed, you might want to increase max_worker_processes setting`

According to the pglogical README:

one process per node is needed on the subscriber node.

After ensuring that we do not exceed the limit, we started searching for related documents and encountered two conflicting pieces of information.

One, from EnterpriseDB, states:

One per database + two per subscription on the subscriber (downstream).

The other, from GitHub Issue #7, mentions:

One worker needed for the main maintenance process, one per subscription, one per replicated database, and during startup, it can need one per any connectable database on top of that.

If we have two databases with one subscription for each database:

According to quote one, we will need 2*1 + 2*2 = 6 processes.
According to quote two, we will need 2*2 = 4 processes.

Which one is correct? I don't know.