PG-PCP学习记录

[TOC]

第一周

1. PostgreSQL成熟度解读

基本上就是讲了一下PostgreSQL的历史由来。

2. PostgreSQL安装于性能优化

  1. ./configure –help中的部分参数解读

    1. –enable_debug

    2. –enable_dtrace:增加探针,观察动态部分

    3. –enable_cassert:断言

    4. –with-blocksize=8(k) PG数据块大小

      32位系统:32KB x 2^32 => 2^37 KB =>2^7 TB => 128TB

    5. –with-segize: 单表文件超过多少GB分文件,默认1GB

    6. –with-wal-blocksize: wal日志文件超过多大分割,默认8K

    7. –without-readline:查找历史命令

    8. –with-lz4:

    9. –with-perl

    10. –with-python

    11. –with-selinux

    12. –with-uuid

    13. –with-libxml

  2. 配置 ~/.bash_profile 环境变量

    1. PGHOME
    2. PGDATA
    3. PGUSER
    4. PGPORT
    5. LD_LIBRARY_PATH:动态库
  3. Initdb

  4. initdb

    1. -V 查看版本
    2. -E 编码
    3. -U 用户名
    4. -X 放置日志的目录
  5. pg_ctl

    1. –help
    2. start/stop/restart
    3. stop -m smart/fast/immediate :等待完成后断开/回滚事务并断开(默认)/立刻断开(下次启动时会做数据回放)
    4. stop -ms /stop -mf / stop -mi
    5. -D
    6. -l
    7. -p
  6. postgres

    1. pg_ctl 就是封装的 postgres
  7. 查看postgres进程

    1. ps -axjf | grep postgres

    2. Netstat -natp | grep post

    3. netstat -anpl | grep post

    4. netstat -ano | grep PGSQL

    5. ss -tnlp | grep post

    6. lsof -p pid

    7. stract -o output.txt -T -tt trace=all -p pid

  8. pg_isready -p 5432 检查PG服务是否开启

  9. 文件授权给用户与组

    chown -R user:group path

  10. psql

    1. -c
    2. -U
    3. -d
    4. -p
    5. -c
    6. -h
    7. -W

3. PostgreSQL参数配置

  1. postgresql.conf

    1. Include_dir/include_if_exists/include=’special.conf’
    2. Postgresql.auto.conf保存alter system修改后的参数(下面3操作),不要手动修改,会覆盖postgresql.conf
    3. alter system set shared_buffers to ‘32MB’;
    4. show shared_buffers;
    5. select pg_reload_conf();不需要重启的重载
    6. postgres -c configparameter=newvalue (启动时设置,不推荐)
    7. select name,setting from pg_settings where name = ‘xxx’;
    8. select current_setting(xxx);
    9. show xxx;/show all
    10. set work_mem=”10MB”;
    11. Database级别配置
      1. alter database name set work_mem to ‘10MB’;
      2. alter database name reset work_mem;
    12. Session级别配置
      1. set xxx to ‘xxx’;
      2. set xxx to default;
      3. update pg_settings set setting = ‘xxx’ where name = ‘xxx’;
      4. update pg_settings set setting = reset_val where name = ‘xxx’;
      5. select set_config(‘xxx’, new_value,false);
    13. Role级别配置
      1. alter role name in database database_name set work_mem to ‘10MB’;
      2. alter role name in database database_name reset work_mem;

4. PostgreSQL防火墙

  1. pg_hba.conf
    1. type:local/host
    2. database:xxx/all/replication
    3. user: xxx/all
    4. address: ip_address/
    5. method:trust/reject/md5/sha-256
  2. conninfo

5. PostgreSQL客户端工具的使用

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
psql --help
-U 用户
-d 数据库
-p 端口
-c 执行SQL命令 -c 'xxx' -c 'xxx'
-f 执行文件 x.sql
-t 不输出表头
-x 竖着打印 一般 -x -c 'xxx'


\?
\conninfo 连接信息
\du
\d
\di
\dv
\dt
\ds
\ds+

\copy test1 to '/home/postgres/test1.sql'
\! cat 1.sql
\set ECHO_HIDDEN on -- 打开\什么的执行SQL细节 等于外部 psql -E 进来后操作可看到相同结果
\errverbose -- SQL执行错误可以看到详细错误信息
select 1; watch 1;-- 每1秒执行一次前面的SQL
\x -- 竖向查询结果展示

6. PostgreSQL体系结构初探

数据库集群(实例)database cluster

多进程架构

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
ps -axjf | grep postgres


-- postmaster
总进程

-- checkpoint - share buffer 数据刷盘
show max_wal_size;
show checkpoint_timeout;
select * from pg_stat_bgwriter; -- 查看状态

-- background writer
定时的写一部分share buffer的数据到磁盘,缓解checkpoint刷盘的IO间次压力。

-- walwriter
定时将wal buffer里的数据刷到磁盘上。

-- autovacuum launcher
新老数据放一起,定时的去fork一些worker,定时的去清理。
show autovacuum_vacuum_scale_factor;--触发因子

-- stats collector
select * from pg_stat_all_tables where relname = 'test';-- 追踪和收集统计信息

show track_activities; -- track_counts=on 如果=off就不会自动收集了。

-- logical replication launcher
fork worker进程

-- archiver
定时归档进程

-- wal sender

-- wal receiver

不能使用kill -9,会导致实例服务失败重启,进而实例失败重启。

7. PostgreSQL故障排查

  1. 查看操作系统的日志 /var/log/message

  2. 查看pg的日志 $PGDATA/log

    1
    2
    3
    4
    5
    # postgresql.conf
    log_destination='csvlog' # 标准输出
    logging_collector=on # 打开日志收集
    log_rotation_size=10MB # 单个日志大小上线
    log_truncate_on_rotation=off # 同名日志是追加还是truncate
  3. 端口、data文件权限、

8. PostgreSQL SQL技巧

  • PG关键字

    部分关键字不能做列名 或者 使用

    1
    select * from pg_get_keywords();
  • 单引号

    输出字符串

  • $A$XXXX$A$

    Dollar Quoting:$$ XXX $$ ,用$框定无需转译的字符串内容,两个$中间可以写一个标识的名称,比如A;

  • create table

  • drop table

  • insert table [select] / values (),()

  • AS 别名

  • char(n) 、varchar(n)、Text

    char(n):定长,空格补全

    varchar(n): 实际长度

    Text:变长,Toast

  • 表可以成为字段类型

    1
    2
    3
    4
    5
    create table t1(id serival,name varchar(20));
    create table t2(id serival,tt1 t1);

    insert into t2(id, tt1) values (nextval('t2_id_seq'), (nextval('t1_id_seq'), 'xiaoming'));
    select * from t2;

第二周

9. PostgreSQL序列

1
2
3
4
5
6
CREATE [ TEMPORARY | TEMP ] SEQUENCE [ IF NOT EXISTS ] name
[ AS data_type ]
[ INCREMENT [ BY ] increment ]
[ MINVALUE minvalue | NO MINVALUE ] [ MAXVALUE maxvalue | NO MAXVALUE ]
[ START [ WITH ] start ] [ CACHE cache ] [ [ NO ] CYCLE ]
[ OWNED BY { table_name.column_name | NONE } ]
  • TEMPORARY | TEMP

    临时序列,只存在当前会话,会话关闭序列删除;

  • AS data_type

    序列的类型,smallint, integer, or bigint ; 数据范围分别是2^16/2-1,2^32/2-1,2^64/2-1;

  • INCREMENT [BY]

    序列增量,默认为1,也可设置为-1;

  • MINVALUE

    最小序列,如果增量是-1,可以写 NO MINVALUE 自动设置最小值,序列超出会报错

  • MAXVALUE

    最大序列,如果增量是-1,可以写 NO MAXVALUE 自动设置最大值,序列超出会报错

  • START WITH 1

    设置起始值

  • RESTART WITH 1

    重设起始值

  • CACHE 10

    设置每次获取多少个序列段到内存中,默认是1;

    1
    2
    alter sequence t1_id_seq cache 10;
    alter sequence t1_id_seq cache 1;
  • NO CYCLE | CYCLE

    当序列超过最大值后,是否从最小值开始继续循环;

    1
    2
    alter sequence t1_id_seq NO CYCLE;
    alter sequence t1_id_seq CYCLE;
  • OWNED BY t1.id

    序列所属,单向,从表上看见的还是绑定的原序列,如果要更新,需要更新修改表的序列;

    1
    alter table t1 alter column id set default nextval('t11_id_seq');

序列的查询

1
2
3
4
5
6
7
8
9
10
11
12
13
14
# 序列定义信息
\d t1_id_seq

# 序列所属信息
\ds t1_id_seq
\ds+ t1_id_seq

# 序列实际信息
select * from t1_id_seq;

# 简单用法
select nextval('t1_id_seq');
select currval('t1_id_seq');
select setval('t1_id_seq', 1);

10. COPY快速加载

1
2
3
4
5
insert into t1 values (1, 'test');
insert into t1 values (1, 'test'),(2,'test');

-- 批量注入测试数据
insert into t1 select n,'test' from generate_series(1,100000) as n;
1
2
3
\copy t1 to '/home/postgres/t1.csv';
\copy t1 to '/home/postgres/t1.csv' with csv header;
\copy t1 to '/home/postgres/t1.csv' with csv header delimiter '|';

\copy t1 to 事务安全

\copy t1 from 事务不安全

  • 注意:导出语法是什么,导入就要用一样的,否则会报错(只是to变from)

pgloader

pg_bulkload

11. 分区表与性能与运维

传统分区表

利用对表操作的触发器,出发存储过程的判断并将数据指定插入的对应的子表。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
-- 创建存储过程
create or replace function p_t_trigger()
returns trigger as $$
begin
if (new.date >='2023-01-01'::date and new.date <= '2023-06-30'::date) then insert into p_t2 values (new.*);
elsif (new.date >='2023-07-01'::date and new.date <= '2023-12-31'::date) then insert into p_t3 values (new.*);
elsif (new.date >='2022-01-01'::date and new.date <= '2022-12-31'::date) then insert into p_t4 values (new.*);
else raise exception 'exception code.';insert into p_te values (new.*);
end if;
return null;
end;
$$
language plpgsql;

-- 创建触发器
create or replace trigger insert_trigger before insert on p_t1 for each row
execute procedure p_t_trigger();

-- 创建表(使用check检查是否满足条件)
create table p_t1(id serial, name varchar, date date);
create table p_t2(check(date >='2023-01-01'::date and date <= '2023-06-30'::date)) inherits (p_t1);
create table p_t3(check(date >='2023-07-01'::date and date <= '2023-12-31'::date)) inherits (p_t1);
create table p_t4(check(date >='2022-01-01'::date and date <= '2022-12-31'::date)) inherits (p_t1);
create table p_te() inherits (p_t1);

-- 插入数据
insert into p_t1(name, date) values ('into t1','2022-01-01');
insert into p_t1(name, date) values ('into t1','2023-01-01');
insert into p_t2(name, date) values ('into t2','2023-01-01');
insert into p_t1(name, date) values ('into t1','2023-08-01');
insert into p_t3(name, date) values ('into t3','2023-08-01');

-- 检查分区
select * from p_t1;
select * from p_t2;
select * from p_t3;
select * from p_t4;

-- 清空表
truncate table p_t1;
truncate table p_t2;
truncate table p_t3;
truncate table p_t4;

-- 删除分区字表 - 删除后会丢失子表的数据 并且在下次出发插入到此子表的条件后 会报错
drop table p_t2;

-- 解除分区关系 - 触发器依旧会按照条件找对应的子表并写入数据,但是主表查不到子表的数据
alter table p_t2 no inherit p_t1;

内置分区表

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
-- 创建分区表
create table t2(city_id int not null,logdate date not null,peaktemp int,unitsales int) partition by range(logdate);

-- 创建分区
create table t2_y2023m01 partition of t2 for values from ('2023-01-01') to ('2023-01-31');
create table t2_y2023m02 partition of t2 for values from ('2023-02-01') to ('2023-02-28');
create table t2_y2023m03 partition of t2 for values from ('2023-03-01') to ('2023-03-31');
create table t2_y2023m04 partition of t2 for values from ('2023-04-01') to ('2023-04-30');
create table t2_y2023m05 partition of t2 for values from ('2023-05-01') to ('2023-05-31');
create table t2_y2023m06 partition of t2 for values from ('2023-06-01') to ('2023-06-30');
create table t2_y2023m07 partition of t2 for values from ('2023-07-01') to ('2023-07-31');
create table t2_y2023m08 partition of t2 for values from ('2023-08-01') to ('2023-08-31');
create table t2_y2023m09 partition of t2 for values from ('2023-09-01') to ('2023-09-30');
create table t2_y2023m10 partition of t2 for values from ('2023-10-01') to ('2023-10-31');
create table t2_y2023m11 partition of t2 for values from ('2023-11-01') to ('2023-11-30');
create table t2_y2023m12 partition of t2 for values from ('2023-12-01') to ('2023-12-31');
create table t2_y2024m01 partition of t2 for values from ('2024-01-01') to ('2024-01-31');

-- 增删改查测试
insert into t2 values (1,'2023-08-02',10,20);
explain select * from t2;
explain select * from t2 where logdate = '2023-08-03';
explain delete from t2 where logdate = '2023-08-03';
explain update t2 set logdate = '2023-08-02';
explain update t2 set logdate = '2023-08-02' where logdate = '2023-08-03';

-- 卸载分区
alter table t2 detach partition t2_y2023m12;

-- 挂载分区
alter table t2 attach partition t2_y2023m12 for values from ('2023-12-01') to ('2023-12-31');

-- 添加默认分区
create table t2_default partition of t2 default;
-- 卸载默认分区
alter table t2 detach partition t2_default;
-- 挂载默认分区
alter table t2 attach partition t2_default default;

问题解决

  1. 如果发现插入logdate的日期范围不在当前分区范围内,会产生报错:

    解决这样的问题很简单:1. 添加对应的分区 2. 添加default分区

    1. 添加对应的分区

    2. 添加default分区

  2. 如果卸载了某个区间的分区后插入数据会自动进入partition default表,之后再想挂回被卸载的分区就会出现问题:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    -- 卸载2024-01分区
    alter table t2 detach partition t2_y2024m01;
    -- 卸载2024-01分区后 数据2024-01月数据会插入到default分区
    insert into t2 values (1,'2024-01-04',2,3);

    --因为t2_default有2024年01月的数据,所以挂不上
    alter table t2 attach partition t2_y2024m01 for values from ('2024-01-01') to ('2024-01-31');

    -- 将默认分区卸载掉
    alter table t2 detach partition t2_default;
    -- 成功挂载
    alter table t2 attach partition t2_y2024m01 for values from ('2024-01-01') to ('2024-01-31');
    -- 将default分区挂回去,发现失败
    alter table t2 attach partition t2_default default;

    -- 创建default分区备份表,将数据做一次转换
    create table t2_default_bck (like t2_default including all);
    insert into t2_default_bck select * from t2_default;
    truncate table t2_default;
    alter table t2 attach partition t2_default default;
    insert into t2 select * from t2_default_bck;
    drop table t2_default_bck;

12. 物化视图

普通视图

从explain执行计划看,数据查询依旧是在原始表上操作。

1
2
3
create or replace view t2_view as (
select * from t2
);

物化视图

1
2
3
4
create materialiazed view t2_mview as (
select * from t2
);

扩展

pg_ivm : 提供了一种即时维护,其中在修改基表后立即更新物化视图。

13. SQL开发

Function

1
2
3
4
5
6
7
create or replace function test_func(id integer) returns void $$
declare
begin
-- XXX
return void;
end;
$$ language plpgsql;
  • 返回一个表数据,使用TABLE
1
2
3
4
5
6
create or replace function test_func(id int) returns table(id int, name varchar) as $$
declare
begin
return query select id,name from t2 where id < 100;
end;
$$ language plpgsql;
  • 返回一个表数据,使用SETOF
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
create table t2(id int, name varchar);
create type tt2(id int, name varchar);

create or replace function test_func(id int) returns setof tt2 as $$
declare
begin
return query
select *
from t2
where id < 100;
end;
$$ language plpgsql;

select test_func(1);
select * from test_func(1);
  • 返回多行数据
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
-- return next 必须与 returns setof连用
drop function if exists test_func;
create or replace function test_func() returns setof t2 as $$
DECLARE
r t2%rowtype;
BEGIN
for r in
select * from t2 order by logdate
loop
return next r;
end loop;
return;
end;
$$ language plpgsql;

select * from test_func();

  • 游标的使用
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
drop function if exists test_func;
create or replace function test_func() returns void as $$
DECLARE
new_cr CURSOR FOR SELECT * FROM t2;
p_record RECORD;
BEGIN
open new_cr;
loop
fetch new_cr into p_record;
exit when not found;
RAISE NOTICE 'Processing person: % %', p_record.city_id, p_record.logdate; --通常采用日志的方式查看
end loop;

close new_cr;
end;
$$ language plpgsql;

select * from test_func();

  • Create User 和 Create Role 的区别

    PG中User与Role等同,不过Create User 会比 Create Role 默认多一个login权限。

  • 授权和移出权限

    1
    2
    GRANT 权限 ON 表范围 TO 角色;
    REVOKE 权限 ON 表范围 FROM 角色;

    特殊的可以控制模式下的某些表的某些列的增删改查;另外不仅针对表,对数据库各类实体对象都有权限控制。

Procedure

14. FDW(外部表)

  • 查询Postgres

  • 查询Mysql

  • 查询Oracle

  • 查询CSV

  • 检索file_fdw扩展文件

    1
    find / -name file_fdw
  • 安装file_fdw扩展

1
2
cd /home/postgres/downloads/postgresql-14.2/contrib/file_fdw
make && make install
  • 创建FDW服务、表 并访问
1
2
3
4
5
6
7
8
9
CREATE EXTENSION file_fdw;
CREATE SERVER filefdw_server FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE txtfiles(id int, name varchar) SERVER filefdw_server OPTIONS (filename '/home/postgres/import.csv', format 'csv');

CREATE TABLE t2(id serial, name varchar);
INSERT INTO t2(name) SELECT md5(n::varchar) FROM generate_series(1,100000) as n;
\COPY t2 to '/home/postgres/import.csv' with DELIMITER ',';

SELECT * FROM txtfiles;

15. 全文检索

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
create table t2(info text);

-- 注意数据量需要大一些 否则不走索引
insert into t2 select md5(random()::text) from generate_series(1,10000) as n;

create index t2_idx on t2(info varchar_pattern_ops);
-- https://www.runoob.com/manual/PostgreSQL/indexes-opclass.html
-- int int4_ops
-- varchar varchar_pattern_ops
-- text text_pattern_ops
-- char bpchar_pattern_ops

explain select * from t2 where info like '%hello'; -- 不走索引
explain select * from t2 where info like '%hello%'; -- 不走索引
explain select * from t2 where info like 'hello%'; -- 走索引

create index t2_idx on t2(reverse(info) varchar_pattern_ops); -- 逆向

explain select * from t2 where info like 'olleh%'; -- 走索引
1
2
3
4
5
6
7
-- 为了使用两边的%都能匹配到索引 使用扩展pg_trgm
create extension pg_trgm;
create index myidx_t2 on t2 using gin(info gin_trgm_ops);
explain select * from t2 where info like '%ab'; -- 走索引
explain select * from t2 where info like 'ab%'; -- 走索引
explain select * from t2 where info like '%ab%'; -- 不走索引(需要至少三个字符)
explain select * from t2 where info like '%abc%'; -- 走索引
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
-- 基于文本内容切词来进行查询检索
select to_tsvector('Ia ama boy,ia ama girl.') @@ to_tsquery('am'); -- f
select to_tsvector('Ia ama boy,ia ama girl.') @@ to_tsquery('ama'); -- t
select to_tsvector('Ia ama boy,ia ama girl.') @@ to_tsquery('am|ama'); -- t
select to_tsvector('Ia ama boy,ia ama girl.') @@ to_tsquery('ama&am'); -- f
select to_tsvector('Ia ama boy,ia ama girl.') @@ to_tsquery('!ama'); -- f

to_tsvector -- 预处理
to_tsquery -- 查询
\dFp+ --切词词典
\dF+ english -- 英文词典

--后续可了解
ts_delete
ts_filter
ts_headline
tsvector_to_array
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
-- 全文检索的实例
drop table if exists t1;
drop table if exists full_text_index;

create table t1(id bigserial not null, name text not null, author_key text not null, constraint author_key primary key(id), constraint key unique (author_key));
create table full_text_index(id bigserial not null, text_fti tsvector not null);

create or replace function node_text_fti()
returns trigger AS
$BODY$
begin
if TG_OP = 'INSERT' THEN
insert into full_text_index(text_fti) values(to_tsvector(coalesce(NEW.name,'')));
end if;
return NEW;
end;
$BODY$ LANGUAGE plpgsql VOLATILE COST 100;

CREATE TRIGGER create_author_index
AFTER INSERT ON t1 FOR EACH ROW
EXECUTE PROCEDURE node_text_fti();

-- 因为是切词 所以这里不能使用generate_series 它生成的是一个整体 无法切词
INSERT INTO t1(name,author_key) values ('i am a boy, i am a girl.','12345');
INSERT INTO t1(name,author_key) values ('i am a dog, i am a pig.','123451');

SELECT *
FROM t1 join full_text_index as fti on (t1.id = fti.id)
where fti.text_fti @@ to_tsquery('boy');

-- 中文的可以用插件pg_jiaba

第三周

16. 体系结构深入:总体架构

  • 多进程架构

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    \d
    set search_path to 'new_schema'; -- 切换默认搜索的模式空间
    \d

    \db+
    \d information_schema.*

    select * from pg_namespace;
    set search_path to 'pg_catalog';
    \set ECHO_HIDDEN on -- 打开查询SQL语句显示
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    -- 权限顺序
    revoke connect on database mydb from public;

    psql

    drop database mydb;
    create database mydb;
    revoke connect on database mydb from public;
    grant create on DATABASE mydb TO u1;
    drop user u1 cascade;
    create user u1 password '123';

    \c mydb
    create schema mysm;
    create table mysm.t1(id int);
    insert into mysm.t1 values (1);
    \q

    psql -U u1 -d mydb
    psql
    grant CONNECT ON DATABASE mydb TO u1;
    \q
    psql -U u1 -d mydb

    select * from mysm.t1;
    \c mydb postgres
    grant usage on schema mysm to u1;

    \c mydb u1
    select * from mysm.t1;
    \c mydb postgres
    grant select on table mysm.t1 to u1;
    \c mydb u1
    select * from mysm.t1;
    \q

17. 体系结构升入:进程结构

1
2
ps -axjf | grep [pid] | grep -v 'grep'
select * from pg_stat_bgwriter;
  • Postmater

    Postgres的主进程

  • checkpoint

  • background writer

  • walwriter

  • autovacum launcher

  • stats collector

  • logical replication launcher

18. 体系结构深入:内存结构

  • 本地内存

    • work_mem

      sort、distince、hash操作时需要使用。

      太小会频繁发生与磁盘swap的IO;

      太大会可能会出现内存溢出,配置postgresql.conf 中的log_temp_files=0来监测和输出当占用过多时,有日志提醒,有迹可循;

      1
      2
      show work_mem;
      set work_mem='64MB';-- 会话级别临时调整 只针对当前会话
    • temp_buffer

      临时表数据访问时候使用的内存;

    • maintenance_work_mem

      • vacuum清理dead tuple时,扫描的dead tuple列表放在 maintenance_work_mem中,Restore、Create Index, Create Foreign Key、Alter table 等会使用到。

      • Vacuum执行时会使用内存数计算

        autovacuum_max_workers x autovacuum_work_mem = 3 x 64MB

        autovacuum_work_mem = -1 默认采用 maintenance_work_mem的大小比如这里是64MB;

      • 后续了解

        1
        2
        3
        \d pg_stat_progress_create_index;
        \d pg_stat_progress_vacuum;
        select * from pg_stst_all_table where relname='test';
  • 共享内存

    1
    select * from pg_shmem_allocations;
    • share buffer
      • ring buffer 环形缓冲区
    • wal buffer
    • commit buffer
  • 内存值观测

    1
    2
    3
    4
    5
    6
    ps -aux | grep postgres
    cat /proc/[pid]/smaps -- 内存
    cat /proc/[pid]/smaps | grep zero -- 共享内存大小
    cat /proc/[pid]/smaps
    smem -- 需要安装,暂时没研究

    1
    show shared_buffers; -- pg 共享内存
    1
    pmap -d [pid]

19. 体系结构深入: PostgreSQL数据交换

双缓存

磁盘 => OS缓存(page cache) => PG缓存(buff cache) => 应用

1
cat /proc/meminfo

在执行checkpoint时:

shared_buffers -> page cache –[fsync]–> 磁盘

1
2
3
4
5
show checkpoint_completion_target;
show checkpoint_timeout;
create extension pg_buffercache;
select * from pg_buffercache where isdirty = 't';
checkpoint;

20-1. 体系结构深入: PostgreSQL物理结构(上)

  • 一个PG服务占用一个端口,多个PG服务可以指定多个不同的端口进行;

  • TOAST

    1
    ./configure --help | grep block # 8KB

- 20-2. 体系结构深入: PostgreSQL物理结构(下)

21. 体系结构深入: PostgreSQL表空间

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
\db+ -- 查看表空间
select pg_relation_filepath('t1'); -- 查看当前表的空间地址

oid2name --help
oid2name -o 2604 -- 2604是表的文件名oid查表名
oid2name -t t1 -- 通过表名查文件名oid



mkdir /usr/local/pg14/mytp -p
psql
create tablespace mytp location '/usr/local/pg14/mytp';
create table t3(id int) tablespace mytp;

select pg_relation_filepath('t3');
\dt+ t3;
\q
ll /usr/local/pg14/data/pg_tblspc
cd /usr/local/pg14/mytp
oid2name -o XXX
oid2name -t t3


psql
select * from pg_tablespace;

alter table t1 set tablespace mytp;

-- 调整一些参数
-- EFFECTIVE_IO_CONCURRENCY 预读数据块 MAINTENANCE_IO_CONCURRENCY 维护 RANDOM_PAGE_COST SEQ_PAGE_COST
alter tablespace mytp set(RANDOM_PAGE_COST = 2);

-- 默认表空间切换
set default_tablespace to mytp;

--
show temp_tablespace;

22. 体系结构深入: PostgreSQL记录Page磁盘和内存关系

一个数据库 = 8KB

1
2
3
4
show block_size;

select relpages,reltuples,reltuples/relpages rows_per_page from pg_class where relname = 't1';
select relpages,reltuples from

数据库新增数据是倒着存放的,当一个块被占满后,会请求新块存放。

1
2
3
select pg_relation_filepath('t1');
hexdump -C [filepath]
pg_filedump -- 有时间可以研究

第四周

23. 体系结构深入: PostgreSQL建立会话(连接)的过程

1
2
3
4
5
6
7
pg_ctl -D $PGDATA start 
postgres -D $PGDATA & -- 等价于
postgres --single -D $PGDATA -- 单用户使用 防止数据库最老的表年龄不够了 无法连接登录维护 单用户单进程单线程

有多个postmaster进程,可能是因为连接数过多,系统还没有能够将用户连接进程fork完成,处于中间状态就会出现多个postmaster进程显示的情况,这属于正常情况,所以需要前置连接池。

连接数越多,性能现行下降,pg14有一定的优化。

24. 体系结构深入: PostgreSQL事务日志

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
-- wal 日志
psql
beign;
create table t1(id int);
insert into t1 values (1);
select txid_current();
insert into t1 values (2);
select txid_current(); -- get translation id [txid]
insert into t1 values (3);
select txid_current();
select pg_relaction_filepath('t1');

du -sh $PADATA/xxx/xxx

ls -lrth
pg_waldump -o txid wal文件

image-20230908131102254 image-20230908131113910

PostgreSQL持久性优化机制——WAL

  • 数据发生变动时
  • commit和checkpoint
1
2
3
4
5
6
7
8
pageinspect 插件

psql
create extension if not exists pageinspect;
select * from heap_page_items(get_raw_page('t1',0));
select * from page_header(get_raw_page('t1',0));
\q
-- 和wal日志的lsn号比较,确认那些wal数据需要刷盘。
1
2
3
show synchronous_commit; -- on 确保commit时候wal日志已经刷盘;off相反
show wal_writer_delay; -- 异步提交延迟时间
select * from pg_settings where name = 'wal_writer_delay';
  • archive 进程(后台归档进程)

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    archive_mode
    select * from pg_stat_archiver ;-- 上次归档失败的wal日志
    select * from pg_tables;

    select * from pg_constraint;
    select * from pg_indexes;
    select * from pg_trigger;
    select * from pg_views;

    select pg_postmaster_start_time(); -- 数据库启动时间

    select * from pg_control_system(); -- 数据库参数
    select to_timestamp(((system_identifier>>32) & (2^32-1)::bigint)) from pg_control_system(); -- 数据库创建时间


    select pg_size_pretty(pg_relation_size('t1')); -- 表大小计算
    select pg_get_indexdef(indexrelid) from pg_stat_user_indexes; -- 获取所有索引定义
    \d pg_stat_user_indexes;

25. 体系结构深入:PostgreSQL体系结构总结

1
2
3
4
5
6
7
8
9
postmaster process -- 主控进程、监听连接、fork进程分发连接
pg_hba.conf -- 防火墙
postgresql.conf -- 参数配置
pg_ident.conf -- 配置哪些操作系统用户可以映射为数据库用户

wrok_mem -- 排序、哈希、生成临时数据
maintenance_work_mem --
effective_cache_size --
stats collector process -- 统计信息收集

26.PostgreSQL备份与恢复介绍

1
\copy t1 to '/home/postgres/t1.csv' with csv header delimiter '|';
  • 冷备份

    停机+长时间cp

    1
    cp -r pgdata/ pgdata_bak
  • 热备份

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    pg_ctl stop
    pg_ctl -D data2 stop
    rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak
    initdb

    mkdir /usr/local/pg14/archive_dir2 -p
    echo "archive_mode=on" >> /usr/local/pg14/data/postgresql.conf
    echo "archive_command='cp %p /usr/local/pg14/archive_dir2'" >> /usr/local/pg14/data/postgresql.conf
    pg_ctl -l logs start

    psql
    create table t1(id int);

    begin;
    insert into t1 select n from generate_series(1,10000) as n;
    select * from txid_current();
    commit;

    select pg_switch_wal();
    checkpoint;

    select pg_start_backup('mybc01','t','f');
    select pg_stop_backup('f');
    \q

    cp -R data data2;

    ls -lrth archive_dir2 && ls -lrth data/pg_wal

    rm -rf data2/pg_wal/*
    rm data2/postmaster.pid
    mkdir data2/pg_wal/archive_status
    cp data/pg_wal/000000010000000000000003 data2/pg_wal/000000010000000000000003
    ls -lrth archive_dir2 && ls -lrth data/pg_wal && ls -lrth data2/pg_wal

    # 如果没有提前准备postgresql.conf就用下面的vim+echo
    cp -c postgresql.conf data2/

    #vim data2/postgresql.conf
    #echo "wal_level=replica" >> data2/postgresql.conf
    #echo "port=5433" >> data2/postgresql.conf
    #echo "restore_command='cp /usr/local/pg14/archive_dir2/%f %p'" >> data2/postgresql.conf
    #echo "recovery_target_xid='737'" >> data2/postgresql.conf
    #echo "recovery_target_inclusive=on" >> data2/postgresql.conf # 恢复到xid事务前或者当前事务执行完成后
    #echo "recovery_target_action='promote'" >> data2/postgresql.conf # 恢复后可连接
    touch data2/recovery.signal

    pg_ctl -D data2 -l logs2 start
    psql -p 5433
    select count(1) from t1;
    \q
    pg_ctl -D data2 -l logs2 stop
    • pg_is_in_backup函数

      在系统管理函数中 ,还有一个pg_is_in_backup函数,用于检查当前是否在执行一个排他的备份,即是否有exclusive参数设置为TRUE的备份,不能用它检查是否有非排他的备份在进行。

  1. PostgreSQL导入与导出(逻辑)
1
2
3
4
5
6
7
8
-- 纯文本类型备份(plain text)
-- 因为pg_dumpall只能备份数据库集簇级别,所以使用pg_dump
pg_dump -t 'public.t1' -s -f dump.sql -- 只结构
pg_dump -t 'public.t*' -f dump.sql -- 正则匹配多表
pg_dump -t 'public.t1' -f dump.sql
pg_dump -Fp -t 'public.t1' -f dump.sql
-- 纯文本类型恢复
psql -f dump.sql -- 只有文件类型的用psql恢复
1
2
3
4
-- 备份为文件夹
pg_dump -Fd -t 'public.t1' -f dump_dir
-- 文件夹恢复
pg_restore -d postgres dump_dir
1
2
3
-- 备份为压缩文件
pg_dump -Ft -t 'public.t1' -f dump_dir.zip
pg_restore -d postgres dump_dir.zip

28. PostgreSQL PITR(增量备份与恢复)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
# 这个视频很垃圾 从17.14分开始看,前面的别看
pg_basebackup -Fp -P -v -D pgdata_bak
# 校验备份是否有效
pg_verifybackup ../pgdata_bak
# 检查备份的文件指标来源于
more backup_manifest

# 找到事务ID
select txid_current();


# 将最新的wal日志切换
select pg_switch_wal();

select name,setting from pg_settings where name like '%recovery%';
recovery_target_inclusive=on/off # 恢复到当前事务之前 或者 恢复到当前事务结束

# 拷贝归档目录
restore_command = 'cp /home/postgres/archive_dir/%f %p'
# 指定事务号
recovery_target_xid = '事务号ID'
recovery_target_action = 'pause' #只读 ‘promote'可读写 ’shutdown‘关闭

# 创建恢复信号
touch pgdata_bak/recovery.signal

# 查看一些信息
pg_controldata -D pgdata_bak
# 从控制文件取Redo的时间点号

# recovery_target_name 从 pg_create_restore_point();
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
# 实操应用
pg_ctl stop
pg_ctl -D pgdata_bak stop
rm -rf data pgdata_bak archive_dir logs logs2
ll
mkdir archive_dir
initdb

cd data

echo "wal_level = 'replica'" >> postgresql.conf
echo "archive_mode=on" >> postgresql.conf
echo "archive_command='cp %p /usr/local/pg14/archive_dir/%f'" >> postgresql.conf
cd ..

pg_ctl -l logs start
# psql
# select pg_start_backup('a1','t','f');
# select pg_stop_backup('f');
# \q

pg_basebackup -Fp -P -v -D pgdata_bak -p 5432 -U postgres
# --pg_receivewal -D pgdata_bak -p 5432

psql
create table t2(id int);

begin;
insert into t2 select n from generate_series(1,1000) as n;
select txid_current();
commit;
select current_timestamp;

begin;
insert into t2 select n from generate_series(1,100) as n;
select txid_current();
commit;
select current_timestamp;

begin;
insert into t2 select n from generate_series(1,10) as n;
select txid_current();
commit;
select current_timestamp;

drop table t2;

select pg_switch_wal();
--checkpoint;
\q


cd pgdata_bak
vim postgresql.conf
# 手动注释掉 archive_mode 和 archive_command


echo "port=5433" >> postgresql.conf
echo "restore_command='cp /usr/local/pg14/archive_dir/%f %p'" >> postgresql.conf
# echo "recovery_target_name='huanyuan'" >> postgresql.conf
#echo "recovery_target_time='2023-09-09 14:14:00.125658+08'" >> postgresql.conf
echo "recovery_target_xid='735'" >> postgresql.conf
# echo "recovery_target_inclusive=on" >> postgresql.conf # 恢复到xid事务前或者当前事务执行完成后
echo "recovery_target_action='promote'" >> postgresql.conf # 恢复后可连接
# cat postgresql.conf
touch recovery.signal
cd ..
pg_ctl -D pgdata_bak -l logs2 start

psql -p 5433
select count(1) from t2;
\q

image-20230906023410350 image-20230906023423699

  • pg_receivewal

    应该是因为开了归档,wal会自行归档到archive_dir里给后面恢复用。
    所以就不用pg_receivewal了,pg_receivewal也是将新发生写满的wal日志歇过来两边是一样的。

    但就是需要手动ctrl+c

29. PostgreSQL升级

  • 小版本升级

    解析安装包,替换执行文件即可;

  • 大版本升级

    • pg_upgrade方式

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      34
      35
      36
      37
      38
      39
      40
      41
      42
      43
      44
      45
      46
      47
      48
      49
      50
      51
      52
      53
      54
      55
      56
      57
      58
      59
      60
      61
      62
      /usr/local/pg13/bin/pg_ctl stop 
      /usr/local/pg14/bin/pg_ctl stop
      rm -rf /usr/local/pg13/13
      rm -rf /usr/local/pg14/14

      # 变更环境变量为Pg13
      vim ~/.bash_profile

      export PATH=/usr/local/pg13/bin:$PATH
      export PGHOME=/usr/local/pg13
      export PGDATA=$PGHOME/data
      export PGUSER=postgres

      source ~/.bash_profile

      # 源码编译安装pg13、pg14

      # 初始化并启动pg13实例
      /usr/local/pg13/bin/initdb -D /usr/local/pg13/13
      /usr/local/pg13/bin/pg_ctl -D /usr/local/pg13/13 -l /usr/local/pg13/logs start
      psql
      create table t1(id int);
      insert into t1 values (1);
      \q
      # 停止13实例
      /usr/local/pg13/bin/pg_ctl -D /usr/local/pg13/13 -l /usr/local/pg13/logs stop -mf

      # 初始化pg14实例
      /usr/local/pg14/bin/initdb -D /usr/local/pg14/14

      # 升级检查
      /usr/local/pg14/bin/pg_upgrade -d /usr/local/pg13/13/ -D /usr/local/pg14/14/ -b /usr/local/pg13/bin/ -B /usr/local/pg14/bin/ --check --link

      /usr/local/pg14/bin/pg_upgrade -d /usr/local/pg13/13/ -D /usr/local/pg14/14/ -b /usr/local/pg13/bin/ -B /usr/local/pg14/bin/ --check --clone

      /usr/local/pg14/bin/pg_upgrade -d /usr/local/pg13/13/ -D /usr/local/pg14/14/ -b /usr/local/pg13/bin/ -B /usr/local/pg14/bin/ --check

      # 升级
      /usr/local/pg14/bin/pg_upgrade -d /usr/local/pg13/13/ -D /usr/local/pg14/14/ -b /usr/local/pg13/bin/ -B /usr/local/pg14/bin/


      /usr/local/pg14/bin/pg_ctl -D /usr/local/pg14/14 -l /usr/local/pg14/logs start

      /usr/local/pg14/bin/vacuumdb --all --analyze-in-stages
      /usr/local/pg14/delete_old_cluster.sh


      # 变更环境变量为Pg14
      vim ~/.bash_profile

      export PATH=/usr/local/pg14/bin:$PATH
      export PGHOME=/usr/local/pg14
      export PGDATA=$PGHOME/data
      export PGUSER=postgres

      source ~/.bash_profile


      psql
      select * from t1;
      \q

    • pg_dumpall方式

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      34
      35
      36
      37
      38
      39
      40
      41
      42
      43
      44
      /usr/local/pg13/bin/pg_ctl -D /usr/local/pg13/13 stop 
      /usr/local/pg14/bin/pg_ctl -D /usr/local/pg14/14 stop
      exit
      rm -rf /usr/local/pg13/13
      rm -rf /usr/local/pg14/14
      mkdir /usr/local/pg13/13 -p
      chown -R postgres.postgres /usr/local/pg13/13
      mkdir /usr/local/pg14/14 -p
      chown -R postgres.postgres /usr/local/pg14/14
      su - postgres
      cd /usr/local/pg13/13


      # 初始化并启动pg13实例
      /usr/local/pg13/bin/initdb -D /usr/local/pg13/13
      echo "port=5432" >> /usr/local/pg13/13/postgresql.conf
      /usr/local/pg13/bin/pg_ctl -D /usr/local/pg13/13 -l logs start


      /usr/local/pg14/bin/initdb -D /usr/local/pg14/14
      echo "port=5433" >> /usr/local/pg14/14/postgresql.conf
      /usr/local/pg14/bin/pg_ctl -D /usr/local/pg14/14 -l logs start

      /usr/local/pg13/bin/psql -p 5432
      create table t1(id int);
      insert into t1 values (1);
      \q

      pg_dumpall -p 5432 | psql -p 5433

      /usr/local/pg13/bin/psql -p 5433
      select * from t1;
      \q

      #======================================================
      # 变更环境变量为Pg14
      vim ~/.bash_profile

      export PATH=/usr/local/pg14/bin:$PATH
      export PGHOME=/usr/local/pg14
      export PGDATA=$PGHOME/data
      export PGUSER=postgres

      source ~/.bash_profile
    • pg_dump + pg_restore方式

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      34
      35
      36
      37
      38
      39
      40
      41
      42
      43
      44
      45
      46
      /usr/local/pg13/bin/pg_ctl -D /usr/local/pg13/13 stop 
      /usr/local/pg14/bin/pg_ctl -D /usr/local/pg14/14 stop
      exit
      rm -rf /usr/local/pg13/13
      rm -rf /usr/local/pg14/14
      mkdir /usr/local/pg13/13 -p
      chown -R postgres.postgres /usr/local/pg13/13
      mkdir /usr/local/pg14/14 -p
      chown -R postgres.postgres /usr/local/pg14/14
      su - postgres
      cd /usr/local/pg13/13


      # 初始化并启动pg13实例
      /usr/local/pg13/bin/initdb -D /usr/local/pg13/13
      echo "port=5432" >> /usr/local/pg13/13/postgresql.conf
      /usr/local/pg13/bin/pg_ctl -D /usr/local/pg13/13 -l logs start


      /usr/local/pg14/bin/initdb -D /usr/local/pg14/14
      echo "port=5433" >> /usr/local/pg14/14/postgresql.conf
      /usr/local/pg14/bin/pg_ctl -D /usr/local/pg14/14 -l logs start

      /usr/local/pg13/bin/psql -p 5432
      create table t1(id int);
      insert into t1 values (1);
      \q

      rm -rf /usr/local/pg13/13/backup.sql
      pg_dump -Ft -p 5432 -v -f/usr/local/pg13/13/backup.sql
      pg_restore -d postgres -p 5433 /usr/local/pg13/13/backup.sql

      /usr/local/pg13/bin/psql -p 5433
      select * from t1;
      \q

      #======================================================
      # 变更环境变量为Pg14
      vim ~/.bash_profile

      export PATH=/usr/local/pg14/bin:$PATH
      export PGHOME=/usr/local/pg14
      export PGDATA=$PGHOME/data
      export PGUSER=postgres

      source ~/.bash_profile
    1
    2
    3
    4
    5
    6
    pg_dump -- 
    pg_dumpall -- # 安全但是慢
    pg_upgrade -- # 内存翻倍 需要停机
    pg_upgrade --link # 文件引用使用但是如果出错 原库不能使用 但是需要停机

    流复制 主(低版本) 从(高版本) -- 最灵活 最安全 但是配置相对复杂

  • pg_dumpall只能备份整个数据库集簇,如果想备份【单一数据库】,使用如下:

    1
    2
    pg_dumpall --globals-only > globals.sql
    pg_dump -d your_database_name >> globals.sql

30. 容灾和高可用解决方案

流复制、逻辑复制、pgpool、repmgr、patroni、Bucardo、pacemaker+corosync、vip

第五周

31. Postgres流复制简介

每一笔操作先记录wal日志,然后通过wal日志再进行落盘。

将wal日志每生成一条就给备库发送一条,备库再进行回放,就可以实现数据库信息同步。

可以实现一主一从,也可以1拖2,一拖3;为了减少主库压力,可以采用链式,主-从-从从…

流复制默认为异步复制,可以手动调整为同步复制。异步复制可能在大数据量时,主库挂了,从库提升为主库,因为异步可能会出现数据不一致的情况。

主库walsender - 从库 walreserver;

判断主从

1
2
3
4
5
6
7
8
9
10
11
12
13
14
walsender - 主 (链式除外)
select * from pg_stat_replication;
select pg_is_in_recovery();
pg_contraldata -D $PGDATA
-- Database cluster stat : in production(主) / (in archive recervy)

synchronous_commit -- 控制事务的同步级别
synchronous_standby_names='node1,node2';
synchronous_standby_names='FIRST 2(node1,node2,node3)';
synchronous_standby_names='ANY 2(node1,node2,node3)';

pg_rewind;

Timeline 区分历史和新wal日志

32-1. PostgreSQL流复制安装与配置(上)

  • 异步

    • 异步1拖1

      来源

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      initdb
      -- 修改postgresql.conf pg_hba.conf
      pg_ctl -l logs start
      psql
      create user u1 with replication;
      alter user postgres password '123';
      \q
      pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R
      tar -zcf pgdata_standby.tar.gz pgdata_standby/
      scp pgdata_standby.tar.gz root@172.16.249.102:/usr/local/pg14/
      scp pgdata_standby.tar.gz root@172.16.249.103:/usr/local/pg14/

      目标

      1
      2
      3
      4
      5
      6
      7
      8
      tar -zxf pgdata_standby.tar.gz
      vim pgdata_standby/postgresql.auto.conf

      # Do not edit this file manually!
      # It will be overwritten by the ALTER SYSTEM command.
      primary_conninfo = 'application_name=s1 user=u1 password=123 host=172.16.249.101 passfile=''/home/postgres/.pgpass'' channel_binding=disable port=5432 sslmode=disable sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'

      pg_ctl -D pgdata_standby -l logs start
      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      pg_ctl stop
      pg_ctl -D pgdata_standby stop
      rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak pgdata_standby
      initdb

      pg_ctl start
      psql
      create user u1 with replication;
      \q
      pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R
      echo "port=5433">> pgdata_standby/postgresql.conf
      pg_ctl -D pgdata_standby start

      psql
      create table t1(id int);
      insert into t1 values (1);
      select * from pg_stat_replication;
      \q

      psql -p 5433
      select * from t1;
      select * from pg_stat_replication;
      \q

      pg_ctl -D pgdata_standby stop

      psql
      insert into t1 values (1);
      \q
    • 异步1拖2

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      34
      35
      36
      37
      38
      39
      40
      41
      42
      43
      44
      45
      46
      pg_ctl stop
      pg_ctl -D pgdata_standby stop
      pg_ctl -D pgdata_standby2 stop
      rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak pgdata_standby pgdata_standby2 log*

      initdb
      pg_ctl start
      psql
      create user u1 with replication;
      create table t1(id int);
      insert into t1 values (1);
      select pg_switch_wal();
      checkpoint;
      \q

      pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R
      cp -R pgdata_standby pgdata_standby2
      echo "port=5433">> pgdata_standby/postgresql.conf
      echo "port=5434">> pgdata_standby2/postgresql.conf

      pg_ctl -l log restart
      pg_ctl -D pgdata_standby -l log1 start
      pg_ctl -D pgdata_standby2 -l log1 start

      psql
      insert into t1 values (2);
      select * from t1;
      select * from pg_stat_replication;
      \q

      psql -p 5433
      select * from t1;
      select * from pg_stat_replication;
      \q

      psql -p 5434
      select * from t1;
      select * from pg_stat_replication;
      \q

      pg_ctl -D pgdata_standby stop
      pg_ctl -D pgdata_standby2 stop

      psql
      insert into t1 values (1);
      \q
    • 异步1拖1拖1(级联)

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      34
      35
      36
      37
      38
      39
      40
      41
      42
      43
      44
      45
      46
      47
      48
      pg_ctl stop
      pg_ctl -D pgdata_standby stop
      pg_ctl -D pgdata_standby2 stop
      rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak pgdata_standby pgdata_standby2 log*

      initdb
      pg_ctl start
      psql
      create user u1 with replication;
      create table t1(id int);
      insert into t1 values (1);
      select pg_switch_wal();
      checkpoint;
      \q

      pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R
      echo "port=5433">> pgdata_standby/postgresql.conf

      pg_ctl -l log restart
      pg_ctl -D pgdata_standby -l log1 start


      pg_basebackup -Fp -D pgdata_standby2 -P -v -U u1 -R -p 5432
      echo "port=5434">> pgdata_standby2/postgresql.conf
      echo "primary_conninfo = 'user=u1 passfile=''/home/postgres/.pgpass'' channel_binding=disable port=5433 sslmode=disable sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'">> pgdata_standby2/postgresql.auto.conf

      pg_ctl -l log restart
      pg_ctl -D pgdata_standby2 -l log1 start

      psql
      insert into t1 values (2);
      select * from t1;
      \q

      psql -p 5433
      select * from t1;
      \q

      psql -p 5434
      select * from t1;
      \q

      pg_ctl -D pgdata_standby stop
      pg_ctl -D pgdata_standby2 stop

      psql
      insert into t1 values (1);
      \q
      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      -- 乱序笔记
      max_wal_sender 有几个备库就需要配置几
      hot_standby=off--不允许查询(读写分离) on 允许查询,默认为不写为on

      create user u1 with replication;
      pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R
      # -R 会将自动生成的参数写入postgresql.auto.conf中 ,指定流复制的主机参数等信息,建议默认加上

      select * from pg_stat_replication;
      \x

      sent_lsn 主库已发送
      write_lsn 备库已写入
      flush_lsn 备库已刷盘
      replay_lsn 备库回放的lsn

      write_lag 写延迟
      flush_lag 刷盘延迟
      replay_lag 回放延迟

      sync_priority
      sync_state=async(异步)

      pg_controldata -D data | grep state --in production # 主库
      pg_controldata -D pgdata_standby | grep state --in archive recovery # 备库

      select pg_is_in_recovery(); -- t 主库
      select pg_is_in_recovery(); -- f 从库
      ```

32-2 .PostgreSQL流复制安装与配置(下)

  • 同步

    • 单节点

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      34
      pg_ctl stop
      pg_ctl -D pgdata_standby stop
      rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak pgdata_standby
      initdb

      pg_ctl start
      psql
      create user u1 with replication;
      \q
      pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R
      echo "port=5433">> pgdata_standby/postgresql.conf
      echo "synchronous_standby_names='s1'" >> data/postgresql.conf
      vim pgdata_standby/postgresql.auto.conf
      # 添加 application_name=s1

      pg_ctl restart
      pg_ctl -D pgdata_standby start

      psql
      create table t1(id int);
      insert into t1 values (1);
      select * from pg_stat_replication;
      \q

      psql -p 5433
      select * from t1;
      select * from pg_stat_replication;
      \q

      pg_ctl -D pgdata_standby stop

      psql
      insert into t1 values (1);
      \q
    • 多节点

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      34
      35
      36
      37
      38
      39
      40
      41
      42
      43
      44
      45
      46
      47
      48
      49
      50
      51
      52
      53
      54
      55
      56
      57
      58
      59
      60
      61
      62
      63
      64
      65
      66
      67
      68
      69
      70
      pg_ctl stop
      pg_ctl -D pgdata_standby stop
      pg_ctl -D pgdata_standby2 stop
      rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak pgdata_standby pgdata_standby2 log*

      initdb
      pg_ctl start
      psql
      create user u1 with replication;
      --select pg_create_physical_replication_slot('node1');
      --select pg_create_physical_replication_slot('node2');

      create table t1(id int);
      insert into t1 values (1);
      select pg_switch_wal();
      checkpoint;
      \q

      pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R
      echo "port=5433">> pgdata_standby/postgresql.conf

      echo "synchronous_standby_names='s1'" >> data/postgresql.conf
      vim pgdata_standby/postgresql.auto.conf
      # 手动添加 application_name=s1

      pg_ctl -l log restart
      pg_ctl -D pgdata_standby -l log1 start
      psql -c "select * from pg_stat_replication;"


      pg_basebackup -Fp -D pgdata_standby2 -P -v -U u1 -R
      echo "port=5434">> pgdata_standby2/postgresql.conf

      echo "synchronous_standby_names='FIRST 1(s1,s2)'" >> data/postgresql.conf # 's1,s2' == 'FIRST 1(s1,s2)'
      # echo "primary_slot_name=node2">> pgdata_standby2/postgresql.conf
      vim pgdata_standby2/postgresql.auto.conf
      # 手动添加 application_name=s2

      pg_ctl -l log restart
      pg_ctl -D pgdata_standby2 -l log2 start
      psql -c "select * from pg_stat_replication;"

      --psql -c 'select * from pg_stat_replication;'
      psql
      create table t1(id int);
      insert into t1 values (2);
      select * from pg_stat_replication;
      \q

      psql -p 5433
      select * from t1;
      select * from pg_stat_replication;
      \q

      psql -p 5434
      select * from t1;
      select * from pg_stat_replication;
      \q

      pg_ctl -D pgdata_standby stop

      psql
      insert into t1 values (1);
      \q

      pg_ctl -D pgdata_standby2 stop

      psql
      insert into t1 values (1);
      \q
      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      -- 继续测试备库切换与数据增量同步
      pg_ctl -D pgdata_standby2 start
      psql
      insert into t1 select n from generate_series(1,100000) as n;
      \q

      psql -p 5434
      select count(1) from t1;
      \q

      pg_ctl -D pgdata_standby start
      psql -p 5433
      select count(1) from t1;
      \q
1
2
3
4
5
6
show synchronous_standby_names; 
synchronous_standby_names='node1,node2';# 等待node1、2都同步完成,主库才算完成
synchronous_standby_names='FIRST 2(node1,node2,node3)';# 等待前两个12节点同步完成才算完成
synchronous_standby_names='ANY 2(node1,node2,node3)';# 等待任意两个节点同步完成就算完成

-- 通过sync(同步流)如果备库挂了,主库就会卡主
1
pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
cd pgdata_standby

vim postgresql.conf
port=5433

pg_ctl -D pg_data_standby start


psql -p 5432
select * from pg_stat_replication; -- 主 record
select * from pg_is_in_recovery(); -- 主 f
\q
psql -p 5433
select * from pg_stat_replication; -- 从 None
select * from pg_is_in_recovery(); -- 从 t
\q

pg_controldata | grep state -- 主
pg_controldata -D pgdata_standby | grep state -- 从
ps -axjf | grep postgres

33-1. PostgreSQL流复制的运维(上)

1. 主从强同步,备库宕机,主库无法写入	
	1. 变更配置,synchronous_standby_name 注释掉;
2. wal日志被覆盖
	1. 找wal日志恢复
	2. 复制槽,slot,保留未同步的备份数据;
3. 延迟备库
4. 流复制冲突
	1. 死锁冲突
	2. 
  • 多节点 + 复制槽
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
pg_ctl stop
pg_ctl -D pgdata_standby stop
pg_ctl -D pgdata_standby2 stop
rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak pgdata_standby pgdata_standby2 log*

initdb
pg_ctl start
psql
create user u1 with replication;
select pg_create_physical_replication_slot('node1');
select pg_create_physical_replication_slot('node2');

create table t1(id int);
insert into t1 values (1);
select pg_switch_wal();
checkpoint;
\q

pg_basebackup -Fp -D pgdata_standby -P -v -U u1 -R
echo "port=5433">> pgdata_standby/postgresql.conf

echo "synchronous_standby_names='s1'" >> data/postgresql.conf
echo "primary_slot_name='node1'">> pgdata_standby/postgresql.conf
vim pgdata_standby/postgresql.auto.conf
# 手动添加 application_name=s1

pg_ctl -l log restart
pg_ctl -D pgdata_standby -l log1 start
psql -c "select * from pg_stat_replication;"


pg_basebackup -Fp -D pgdata_standby2 -P -v -U u1 -R
echo "port=5434">> pgdata_standby2/postgresql.conf

echo "synchronous_standby_names='FIRST 1(s1,s2)'" >> data/postgresql.conf # 's1,s2' == 'FIRST 1(s1,s2)'
echo "primary_slot_name='node2'">> pgdata_standby2/postgresql.conf
vim pgdata_standby2/postgresql.auto.conf
# 手动添加 application_name=s2

pg_ctl -l log restart
pg_ctl -D pgdata_standby2 -l log2 start
psql -c "select * from pg_stat_replication;"
psql -c "select * from pg_replication_slots;" -- 复制槽信息

--psql -c 'select * from pg_stat_replication;'
psql
create table t1(id int);
insert into t1 values (2);
select * from pg_stat_replication;
\q

psql -p 5433
select * from t1;
select * from pg_stat_replication;
\q

psql -p 5434
select * from t1;
select * from pg_stat_replication;
\q

pg_ctl -D pgdata_standby stop

psql
insert into t1 values (1);
\q

pg_ctl -D pgdata_standby2 stop

psql
insert into t1 values (1);
\q

33-2. PostgreSQL流复制的运维(下)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
-- 复制槽的有利有弊
select count(1) from pg_replication_slots where active = 'f';

-- 归档如果失败 wal日志可能会报错 通过以下sql监控
select * from pg_stat_archiver;

-- 延迟备份 -- 接收日志不回放的时间
show recovery_min_apply_delay;

-- 流复制冲突
备库操作导致的主库流复制回放出问题;
select * from pg_stat_database_conflicts;

--
show hot_standby_feedback;

34. PostgreSQL流复制深度理解

1
2
3
4
5
6
synchronous_commit
- off 不管
- local 本地落盘即可
- remote_write 传到远程即可
- on 远程落盘才行
- remote_apply 不仅要远程落盘 还得回放成功才行

35. PostgreSQL逻辑复制原理

流复制是针对整个数据库实例的,也就是正对data的,逻辑复制是针对表的,目前也只针对表;

36. PostgreSQL逻辑复制配置

注意逻辑复制不会同步DDL语句,所以需要订阅端自己先创建一个同样的表

单向逻辑复制会自动创建一个逻辑复制槽,双向本地需自行创建并指定。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
cd /usr/local/pg14
pg_ctl stop
pg_ctl -D pgdata_standby stop
pg_ctl -D pgdata_standby2 stop
rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak pgdata_standby pgdata_standby2 log*

initdb

echo "wal_level = logical" >> data/postgresql.conf
pg_ctl -l logs start
psql
--select * from pg_prepared_xacts;
--rollback prepared 's1'; -- 复制槽的长事务
create table t2(id int primary key, name varchar);
create publication p1 for table t2;
select * from pg_publication;
select * from pg_publication_tables;
select * from pg_publication_rel;
select pg_switch_wal();
checkpoint;
\q

pg_basebackup -Fp -P -v -D pgdata_standby # 只是想复制一个数据库实例用一下而已 不过逻辑复制要求复制表的所有条件都要一致,订阅段的表字段可以多,但是基础发布端的字段、属性都必须一致,包括数据库名称、模式名称等。

cd pgdata_standby
echo "port=5433" >> postgresql.conf
cd ..
pg_ctl -D pgdata_standby start


psql -p 5433
create subscription ss1 connection 'host=localhost port=5432' publication p1;
\q

psql
insert into t2 values (1,'123');
\q

psql -p 5433
select * from t2;
\q
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
-- 复制槽重名问题
psql
create table t1(id int);
select pg_create_logical_replication_slot('sub2');
create publication sub2 for table t1;
select pg_drop_replication_slot('sub2');

create publication pub1 for all tables;
truncate table t2;-- 也是可以的的
create subscription s1 connection 'host=localhost port=5432' publication p1 with (copy_data=false);

create subscription s1 connection 'host=localhost port=5432' publication p1 with (enable=false);
alter subscription s1 enable;

--同步发布
--异步发布
  • 删除再添加逻辑复制订阅,订阅表中就会多刷一遍记录
1
2
3
show max_standby_streaming_delay;
show hot_standby_feedback;
select enumvalsfrom pg_settings where name = 'synchronous_commit';

第六周

37. PostgreSQL逻辑复制:双向复制

  • 不同实例间

    小心互相循环复制,会搞爆内存,必须在两边发布的时候,没有交叉的正在使用的同名表。

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    -- 循环订阅的情况
    psql
    drop publication if exists p1;
    drop table if exists t1;
    create table t1(id int);
    create publication p1 for table t1;
    \q

    psql -p 5433
    drop publication if exists p2;
    drop table if exists t1;
    create database test owner=postgres;
    \c test
    create table t1(id int);
    create publication p2 for table t1;
    create subscription s2 connection 'host=localhost port=5432 dbname=postgres user=postgres' publication p1;
    \q

    psql
    create subscription s1 connection 'host=localhost port=5433 dbname=test user=postgres' publication p2;
    insert into t1 values (1);
    select * from t1;
    \q
  • 双向逻辑复制实例 - 相同实例不同库之间

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    pg_ctl stop
    pg_ctl -D pgdata_standby stop
    pg_ctl -D pgdata_standby2 stop
    rm -rf archive_dir2 data data2 logs logs2 archive_dir pgdata_bak pgdata_standby pgdata_standby2 log*

    initdb
    echo "wal_level = logical" >> data/postgresql.conf
    pg_ctl -l log start
    psql

    create database test owner=postgres;
    create user u1 with replication;

    create table t1(id int primary key, name varchar);
    create table t2(id int primary key, name varchar);
    --创建逻辑复制槽
    select pg_create_logical_replication_slot('node1', 'pgoutput');
    create publication p1 for table t1;
    insert into t1 values (1,'liming');


    \c test
    create table t1(id int primary key, name varchar);
    create table t2(id int primary key, name varchar);

    select pg_create_logical_replication_slot('node2', 'pgoutput');
    create publication p2 for table t2;
    insert into t2 values (2,'xiaohong');
    create subscription s2 connection 'host=localhost port=5432 dbname=postgres user=postgres' publication p1 with (slot_name=node1, create_slot=false);

    \c postgres
    create subscription s1 connection 'host=localhost port=5432 dbname=test user=postgres' publication p2 with (slot_name=node2, create_slot=false);


    select * from t1;
    select * from t2;
    \c test
    select * from t1;
    select * from t2;
    \q
  • 不同实例不同库之间 - 也就是个端口不一样而已

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    -- 以下为随记 待整理
    pg_ctl -D pgdata_standby stop
    pg_ctl stop
    rm -rf data
    rm -rf pgdata_standby
    initdb
    echo "wal_level = logical" >> ./data/postgresql.conf
    pg_ctl -l logs start

    psql
    \c postgres
    select pg_create_logical_replication_slot('myslot', 'pgoutput');
    --select pg_drop_replication_slot('myslot');
    --select * from pg_replication_slots;
    create table t1(id int);
    insert into t1 values (1);
    create publication p1 for table t1;

    create database test;
    \c test
    create table t1(id int);
    create subscription s1 connection 'host=localhost port=5432 dbname=postgres' publication p1 with (slot_name='myslot', create_slot=false);
    select * from t1;
    \q


    -- ERROR: could not drop replication slot "s1" on publisher: ERROR: replication slot "s1" does not exist
    alter subscription s1 disable;
    alter subscription s1 set (slot_name=None);
    drop subscription s1;


38. PostgreSQL逻辑复制:运维

39. PostgreSQL性能优化:执行计划

1
2
3
4
5
6
7
select setting,name from pg_settings where name like '%cost%';

成本按照预定义的值进行计算

-- 统计信息更新时 比例因子
show autovacuum_analyze_scale_factor;
show autovacuum_analyze_threshold;

40. PostgreSQL性能优化:TOP SQL

正常pg_stat_statements模块会在pg的contrib文件下,如果不知道怎么找的可以直接磁盘搜索:

1
sudo find / -name pg_stat_statements;
  • 安装pg_stat_statements

    1
    2
    cd xxx/pg_stat_statements;
    make && make install
  • 配置pg_stat_statements使用共享内存

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    vim xxx/data/postgresql.conf

    #加载pg_stat_statements模块
    shared_preload_libraries='pg_stat_statements'
    #如果要跟踪IO消耗的时间,需要打开如上参数
    track_io_timing = on
    #设置单条SQL的最长长度,超过被截断显示(可选)
    track_activity_query_size = 2048

    # 在postgresql.conf最后添加以下配置参数
    # 在pg_stat_statements中最多保留多少条统计信息,通过LRU算法,覆盖老的记录。
    pg_stat_statements.max = 10000
    # all - (所有SQL包括函数内嵌套的SQL), top - 直接执行的SQL(函数内的sql不被跟踪), none - (不跟踪)
    pg_stat_statements.track = all
    # 是否跟踪非DML语句 (例如DDL,DCL),on表示跟踪, off表示不跟踪
    pg_stat_statements.track_utility = off
    # 重启后是否保留统计信息
    pg_stat_statements.save = on


    pg_ctl -l logs restart
  • 启用扩展

    1
    2
    3
    psql
    create extension pg_stat_statements;
    select * from pg_stat_statements;
  • 学习 pg_stat_statement;

    1
    2
    3
    没有时间戳,可以加一个时间戳列,然后按照时间戳定时更新。

    函数中采集对于事务是不会变的,需要执行pg_stat_clear_snapshot();-- 刷新快照

41. PostgreSQL性能优化:优化措施

  1. 使用连接池

  2. 减少往返(insert 批量插入)

  3. 减少优化器/规划器花销

  4. 磁盘速度于RAM

  5. 分离事务日志

  6. 内存加大点、CPU用好一点

  7. work_mem调整

  8. maintenance_work_mem调整

  9. fsync确保打开

  10. wal_buffers

  11. free space map

  12. Checkpoint_timeout

  13. checkpoint_warning – 提示

  14. 索引

42-1. PostgreSQL日常运维 - 1

多版本控制 - update delete 并没有立刻删除数据,而是新增了一条或者打了个标记

create extension pageinspect;

select * from heap_page_items(get_raw_page(‘test’,0));

Xxx_fsm 空闲映射

xxx_vm 可见性映射

– 手动释放死元组

vacuum [table] - 整理块 不合并

Vacuum full [table] - 整理块 重组 合并(8级锁 耗时大)

42-2. PostgreSQL日常运维 - 2

pg_repack - 不会锁表

索引膨胀 - 重建索引 reindex index index_name;

年龄冻结 - 每20亿事务最好释放一次事务ID;

42-3. PostgreSQL日常运维 - 3