PostgresSQL

PostgresSQL

Tips

  • Postgres vs. PostgreSQL
    • PostgreSQL 为主,Postgres 作为别名
  • Manual
  • Current Document
  • Postgrest
    • REST API for any Postgres database
  • pRest
    • Golang
  • Awesome PostgreSQL
  • Trees In The Database
  • 默认端口: 5432
  • PG 所带的 bin 列表
    • pgbench
    • pg_resetxlog
    • pg_controldata
    • pg_test_fsync
    • pg_upgrade
    • pg_test_timing
    • pg_xlogdump
    • pg_rewind
    • initdb
    • pg_ctl
    • postgres
    • postmaster
    • pg_archivecleanup
  • 其他工具
  • 资源
  • 数据结构
    • Database / Schema / table
  • 数据迁移
  • 参考
    • Why PostgreSQL doesn't have query hints
    • A PostgreSQL Response to Uber
      • Write Amplification
        • PostgreSQL must update every index if a change to the row updates an index.
        • PostgreSQL keeps each version of the tuple on disk until it is vacuumed.
        • Each page changed here must be pushed down the binary replication link.
      • Replication
        • PostgreSQL does not have logical replication in core. (Coming in 10!)
        • Existing logical replication tools (Slony, Bucardo, etc.) are somewhat fiddly to set up and manage.
      • Replica MVCC
        • Incoming streaming replication activity can be blocked by queries, or queries can be cancelled.
        • Naïve users can be surprised by query cancellation messages.
      • Upgrade
        • PostgreSQL does not have in-place major version upgrade.
        • You have to do some kind of process to get low-downtime upgrades.
        • pg_upgrade, while a big improvement, is not a panacea.
          • PostGIS, for example, is a huge pain.
      • Buffer Pool
        • PostgreSQL’s shared buffer management performance peaks at 8-32GB.
          • [citation required]
        • Larger shared_buffers than that (usually) mean diminishing returns.
        • Retrieving things from file system cache is slower than from shared buffers.
      • Connection Management
        • The PostgreSQL forking model is not efficient for lots of connections, or fast connection cycling.
        • While basic RAM statistics can be misleading, each backend does consume a notable amount of memory.
    • Postgres Hidden Gems
  • Hosted Provider
brew install postgresql
brew postgresql-update database
# POSTGRES_USER=postgres
# POSTGRES_PASSWORD
# POSTGRES_DB
# PGDATA=/var/lib/postgresql/data
# 可以用任意 --user 启动
docker run -it --rm -v $PWD/pg:/var/lib/postgresql/data postgres:alpine
# 设置数据目录, 免得后面再指定
export PGDATA=$HOME/data/pg/data
# 查看配置, -W 可要求提示输入密码
initdb -kU postgres -s
# 生成数据库
initdb -kU postgres
# 启动数据库
pg_ctl -D $PGDATA -l $HOME/data/pg/logfile start
pg_ctl -D $PGDATA -l $PGDATA/pg.log start
# 查看运行状态
pg_ctl -D $PGDATA status
-- 查看当前版本
select version();
-- 将字符串风格为字符数组
select regexp_split_to_array('abc','');

Notes

升级

pg_dumpall -U postgres > dumpfile
/etc/init.d/postgresql stop
apk add -u postgresql
/etc/init.d/postgresql setup
/etc/init.d/postgresql start
psql -U postgres -f dumpfile

备份

# 转储单个库
pg_dump dbname > outfile
psql dbname < infile
# -j 8 并发数
# -F d 目录格式, 并发要求使用目录
# -t 指定表
# -O --no-owner
# -f backup 文件/目录名
pg_dump -F d -f backup -j 8 db -t a -t b -O
# 使用压缩
pg_dump dbname | gzip > filename.gz
gunzip -c filename.gz | psql dbname
# 使用自定义格式, 必须要使用 pg_restore 恢复
pg_dump -Fc dbname > filename
pg_restore -d dbname filename
# 转储所有
pg_dumpall > outfile
psql -f infile postgres
-- 导入 CSV
-- COPY
-- https://www.postgresql.org/docs/current/static/sql-copy.html
-- COPY table_name [ ( column_name [, ...] ) ] FROM { 'filename' | PROGRAM 'command' | STDIN } [ [ WITH ] ( option [, ...] ) ]
-- COPY { table_name [ ( column_name [, ...] ) ] | ( query ) } TO { 'filename' | PROGRAM 'command' | STDOUT } [ [ WITH ] ( option [, ...] ) ]
-- 需要管理员权限
-- 支持 text,binary,csv
-- Windows users might need to use an E'' string and double any backslashes used in the path name.
-- 要求文件在服务器上
COPY phonebook (id,name, phone) FROM '/tmp/phonebook.csv' DELIMITER ',' CSV;
-- 带头导出
COPY phonebook TO '/tmp/data.csv' DELIMITER ',' CSV HEADER;
-- \copy
-- https://www.postgresql.org/docs/current/static/app-psql.html#APP-PSQL-META-COMMANDS-COPY
-- \copy { table [ ( column_list ) ] | ( query ) } { from | to } { 'filename' | program 'command' | stdin | stdout | pstdin | pstdout } [ [ with ] ( option [, ...] ) ]
-- 从客户端进行导入
-- 可以使用相对路径, 会先将文件上传到服务器
\copy out_tmp (id,name) from 'out.csv' DELIMITER ',' CSV;
-- 导出
\copy my_table to 'filename' csv header
\copy (select id,name from out_tmp) to 'exp.csv' DELIMITER ',' CSV;
-- 带头
\copy (select id,name from out_tmp) to 'exp.csv' DELIMITER ',' CSV HEADER;
# 导入 JSON
# JSON 如果包含转移可能会有问题, 例如包含 \n
cat data.json | psql -h localhost -p 5432 feeds -c "COPY news_feed (data) FROM STDIN;"
# 支持导入 CSV 和 JSON
# https://github.com/lukasmartinelli/pgfutter
# create table raw_data(id serial primary key, data jsonb);
pgfutter --host localhost --db db-name --user myuser --schema myschema --table raw_data --jsonb json data.json

optimize

-- 这个操作会非常慢
update a set name = b.name from b where a.id = b.id;
-- 这个操作会比整个更新快
create table c as (select a.age, b.name from a left join b on a.id = b.id);
-- 所以整表更新建议
begin;
create table T as select col1, col2, colN from orders;
drop table orders;
alter table T rename to orders;
commit;
-- 置空一列
-- 数据大了后该操作非常慢
UPDATE orders SET status = null;
-- 因此可以考虑这样
ALTER TABLE orders DROP column status
, ADD column status text;

Install

数据类型

JSON

  • 支持 hstore, json 和 jsonb 类型
    • 大部分时候选择 JSONB
    • 如果只写入或者要求快速写入, 很少查询可以选择 JSON
  • https://www.postgresql.org/docs/current/static/datatype-json.html
  • json 文本, 操作更耗时, 每次需要解析, 会保存重复键值, 以最后一次为准
  • josnb 二进制, 一般磁盘空间更大, 写入更耗时, 支持索引
  • 操作符
    • -> 取字段, 可以是数组索引
    • ->> 返回的值始终为 text, 不会有引号
    • #> 指定多个路径, 获取为 text
      • SELECT '{"a":[1,2,3],"b":[4,5,6]}'::json#>>'{a,2}';
    • @> jsonb, 检测左侧是否包含右侧
    • <@ jsonb, 检测右侧是否包含左侧
    • ? jsonb, 检测是否包含 key 或数组元素
      • SELECT '{"a":1, "b":2}'::jsonb ? 'b';
    • ?| jsonb, 检测是否包含某个 key
      • SELECT '{"a":1, "b":2, "c":3}'::jsonb ?| array['b', 'ceeee', 'e'];
    • ?& jsonb, 是否包含所有 key
    • || jsonb, 拼接两个 jsonb
    • - 删除 kv 或数组元素
      • SELECT '{"a": "b"}'::jsonb - 'a'; SELECT '["a", "b"]'::jsonb - 'a';
      • SELECT '["a", "b"]'::jsonb - (-1); SELECT '["a", "b"]'::jsonb - (1);
    • #- 删除路径
  • FAQ
SELECT '5'::json;
SELECT '[1, 2, "foo", null]'::json;
SELECT '{"bar": "baz", "balance": 7.77, "active": false}'::json;
-- 检测是否包含
SELECT '"foo"'::jsonb @> '"foo"'::jsonb;
SELECT '[1, 2, 3]'::jsonb @> '[3, 1]'::jsonb;
-- 指定一个字段
SELECT data->'field' FROM doc;
-- 创建索引
CREATE INDEX idxgin ON api USING GIN (jdoc);
CREATE INDEX idxginp ON api USING GIN (jdoc jsonb_path_ops);
-- 将数组作为行
SELECT
s ->> 'name'
FROM tab t, jsonb_array_elements(t.family -> 'children') s;

Admin

-- 相当于给其他人 root 权限
grant postgres to someone;
GRANT ALL PRIVILEGES ON DATABASE "my_db" to my_user;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO myuser;
-- What GRANT USAGE ON SCHEMA exactly do? https://stackoverflow.com/q/17338621/1870054
--ACCESS DB
REVOKE CONNECT ON DATABASE nova FROM PUBLIC;
GRANT CONNECT ON DATABASE nova TO user;
--ACCESS SCHEMA
REVOKE ALL ON SCHEMA public FROM PUBLIC;
GRANT USAGE ON SCHEMA public TO user;
--ACCESS TABLES
grant usage , select on all sequences in schema s to u;
REVOKE ALL ON ALL TABLES IN SCHEMA public FROM PUBLIC ;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO read_only ;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO read_write ;
GRANT ALL ON ALL TABLES IN SCHEMA public TO admin ;

NOTIFY

LISTEN virtual;
NOTIFY virtual;
NOTIFY virtual, 'This is the payload';
LISTEN foo;
SELECT pg_notify('fo' || 'o', 'pay' || 'load');
-- NOTIFY on insert
-- Send Notify Procedure
CREATE OR REPLACE FUNCTION my_tab()
RETURNS TRIGGER AS $$
DECLARE
BEGIN
PERFORM pg_notify('my_tab_insert_notify', new.id::TEXT);
RETURN new;
END;
$$ LANGUAGE plpgsql;
-- Trigger
CREATE TRIGGER my_tab_after_insert
AFTER INSERT ON my_tab
FOR EACH ROW EXECUTE PROCEDURE my_tab_notify_insert();

Graph

Extension

-- 查看安装的
\dX;
select * from pg_extension;
-- 查看所有的
SELECT * FROM pg_available_extensions;
-- 查看所有的
CREATE EXTENSION IF NOT EXISTS file_fdw;

postgres-fdw

https://wiki.postgresql.org/wiki/Autonomous_subtransactions