Setting Up a Highly Available etcd Cluster

In this article we explore how to configure a 3-node etcd cluster with TLS enabled.

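etcd is one of the most popular open source projects in the cloud native ecosystem. It is hosted by the Cloud Native Computing Foundation (CNCF) and has become a core building block of the Kubernetes infrastructure.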

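An etcd cluster uses the Raft algorithm to elect a leader. Raft needs a majority of members to make progress, so tolerating the loss of one member requires at least 3 members. A highly available etcd cluster therefore needs at least 3 hosts.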
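
As a rough illustration of the sizing rule above, the following sketch prints the quorum size and the number of member failures that clusters of 1 to 5 members can tolerate. The function name quorum is only illustrative; the formula floor(n/2)+1 is the usual Raft majority.

```shell
#!/usr/bin/env bash
# quorum N: smallest majority of an N-member Raft cluster.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

for n in 1 2 3 4 5; do
  q=$(quorum "$n")
  echo "members=$n quorum=$q tolerated_failures=$(( n - q ))"
done
```

A 3-member cluster keeps working with one member down, while a 2-member cluster tolerates no failures at all, which is why 3 is the practical minimum for high availability.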

Environment preparation

Prepare three Linux hosts named etcd1, etcd2, and etcd3.

Unless otherwise noted, the firewall is disabled on all hosts.

Hostname IP
etcd1 172.19.184.7
etcd2 172.19.184.8
etcd3 172.19.184.9

Run the following command on every host:

cat >> /etc/hosts <<EOF
# etcd hosts
172.19.184.7 etcd1
172.19.184.8 etcd2
172.19.184.9 etcd3
EOF
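
Running the append above a second time would duplicate the entries. A minimal sketch of an idempotent variant follows, assuming a hypothetical helper add_host_entry and a HOSTS_FILE variable that defaults to a scratch file, so that trying it does not touch the real /etc/hosts.

```shell
#!/usr/bin/env bash
# Sketch: append a hosts entry only if the name is not already present.
# HOSTS_FILE defaults to a scratch file so the sketch is safe to try;
# set HOSTS_FILE=/etc/hosts (as root) to apply it for real.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
  local ip="$1" name="$2"
  # Look for the name as a whole whitespace-delimited field.
  if ! grep -Eq "[[:space:]]${name}([[:space:]]|\$)" "$HOSTS_FILE" 2>/dev/null; then
    echo "${ip} ${name}" >> "$HOSTS_FILE"
  fi
}

add_host_entry 172.19.184.7 etcd1
add_host_entry 172.19.184.8 etcd2
add_host_entry 172.19.184.9 etcd3
```

Calling add_host_entry again with the same name leaves the file unchanged.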

Disable the firewall

systemctl stop firewalld
systemctl disable firewalld

Disable SELinux

setenforce 0
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

Download the etcd binaries

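On each Linux host, run the following commands to download and install the etcd binaries (this example pins v3.5.0, the latest release at the time of writing):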

ETCD_VER=v3.5.0

# choose either URL
GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GOOGLE_URL}

rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf /tmp/etcd-download-test && mkdir -p /tmp/etcd-download-test

curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp/etcd-download-test --strip-components=1
rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz

/tmp/etcd-download-test/etcd --version
/tmp/etcd-download-test/etcdctl version
/tmp/etcd-download-test/etcdutl version

# move them to the bin folder
mv /tmp/etcd-download-test/etcd /usr/local/bin
mv /tmp/etcd-download-test/etcdctl /usr/local/bin
mv /tmp/etcd-download-test/etcdutl /usr/local/bin
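
The download script above does not check the integrity of the tarball. A minimal sketch of such a check follows, assuming you obtain the expected SHA-256 digest yourself (for example from the checksums published with the release). The helper name verify_sha256 is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare a file's SHA-256 digest with an expected value.
# The expected digest is an input you supply; nothing here fetches it.
verify_sha256() {
  local file="$1" expected="$2" actual
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual)" >&2
    return 1
  fi
}
```

To use it, call verify_sha256 /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz with the expected digest right after the curl step, before the script deletes the tarball.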

Generate and distribute certificates

There are many ways to create a CA certificate and private key. Two popular tools are:

  • openssl
  • cfssl

Here we use cfssl to generate the CA private key (ca-key.pem) and certificate (ca.pem). Install it first:

export CFSSL_URL="https://pkg.cfssl.org/R1.2"
wget "${CFSSL_URL}/cfssl_linux-amd64" -O /usr/local/bin/cfssl
wget "${CFSSL_URL}/cfssljson_linux-amd64" -O /usr/local/bin/cfssljson
chmod +x /usr/local/bin/cfssl /usr/local/bin/cfssljson

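Create a directory named certs. The commands in the following sections are run inside it to generate the CA certificate and a server certificate and key pair for each host.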

mkdir certs && cd certs

Generate the certificates

Unless otherwise noted, run the commands in the certs directory.

cfssl

Create the CA certificate

First, create the CA certificate, which will be used by all etcd servers and clients.

echo '{"CN":"CA","key":{"algo":"rsa","size":2048}}' | cfssl gencert -initca - | cfssljson -bare ca -
echo '{"signing":{"default":{"expiry":"43800h","usages":["signing","key encipherment","server auth","client auth"]}}}' > ca-config.json

This produces four files: ca-key.pem, ca.pem, ca.csr, and ca-config.json.


Next, generate the certificate and key for the first node.

export NAME=etcd1
export ADDRESS=172.19.184.7,$NAME
echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $NAME

Repeat the steps above for the other two nodes.

export NAME=etcd2
export ADDRESS=172.19.184.8,$NAME
echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $NAME

export NAME=etcd3
export ADDRESS=172.19.184.9,$NAME
echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $NAME
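
The per-node commands above differ only in the node name and IP address, so they can be folded into a loop. The sketch below is one possible way to do that. The NODES list and the helpers node_address and gen_node_cert are hypothetical names introduced here. When cfssl or ca.pem is not available, it only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch: generate the three node certificates in one loop.
# NODES is a hypothetical "name=ip" list built from the hosts table.
NODES="etcd1=172.19.184.7 etcd2=172.19.184.8 etcd3=172.19.184.9"

# Print the -hostname value (IP,NAME) for one "name=ip" entry.
node_address() {
  local name="${1%%=*}" ip="${1#*=}"
  echo "${ip},${name}"
}

# Run the same cfssl pipeline as above for one "name=ip" entry.
gen_node_cert() {
  local name="${1%%=*}" address
  address="$(node_address "$1")"
  echo '{"CN":"'"$name"'","hosts":[""],"key":{"algo":"rsa","size":2048}}' \
    | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem \
        -hostname="$address" - \
    | cfssljson -bare "$name"
}

if command -v cfssl >/dev/null 2>&1 && [ -f ca.pem ]; then
  for node in $NODES; do gen_node_cert "$node"; done
else
  # Dry run: cfssl or the CA files are missing, so only show the plan.
  for node in $NODES; do echo "would run cfssl for $(node_address "$node")"; done
fi
```

Keeping the node list in one variable means adding or renaming a member only requires editing NODES.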


Remember to replace the IP addresses and node names with your own.

At this point we have generated certificates and keys for the CA and all three nodes.


openssl

Generate the CA certificate

openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=CA" -days 10000 -out ca.crt


Create the etcd certificate and private key

Use the CA certificate and private key generated above to issue the etcd certificate and private key.

Create the configuration file etcd-ca.conf

cat > etcd-ca.conf <<EOF
[ req ]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn

[ dn ]
C = CN
ST = ShangHai
L = ShangHai
O = etcd
OU = system
CN = etcd

[ req_ext ]
subjectAltName = @alt_names

[ alt_names ]
DNS.2 = etcd1
DNS.3 = etcd2
DNS.4 = etcd3
IP.2 = 172.19.184.7
IP.3 = 172.19.184.8
IP.4 = 172.19.184.9

[ v3_ext ]
authorityKeyIdentifier=keyid,issuer:always
basicConstraints=CA:FALSE
keyUsage=keyEncipherment,dataEncipherment
extendedKeyUsage=serverAuth,clientAuth
subjectAltName=@alt_names
EOF

Generate the key

openssl genrsa -out etcd.key 2048

Generate the certificate signing request (CSR)

openssl req -new -key etcd.key -out etcd.csr -config etcd-ca.conf

Generate the certificate

openssl x509 -req -in etcd.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out etcd.crt -days 10000 \
  -extensions v3_ext -extfile etcd-ca.conf

Verify the certificate

openssl verify -CAfile ca.crt etcd.crt


Distribute the certificates

Now distribute the certificates to every node in the cluster.

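Run the following commands, replacing the user name and host addresses with your own, to copy the certificates to the corresponding nodes.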

scp ca.pem root@etcd1:etcd-ca.crt
scp etcd1.pem root@etcd1:server.crt
scp etcd1-key.pem root@etcd1:server.key
scp ca.pem root@etcd2:etcd-ca.crt
scp etcd2.pem root@etcd2:server.crt
scp etcd2-key.pem root@etcd2:server.key
scp ca.pem root@etcd3:etcd-ca.crt
scp etcd3.pem root@etcd3:server.crt
scp etcd3-key.pem root@etcd3:server.key
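
The nine copy commands above follow one pattern per node. As a sketch, the hypothetical helper scp_plan below prints the three commands for a given node, assuming the cfssl file names (ca.pem, NAME.pem, NAME-key.pem) and the root@NAME targets used above. It only prints, so you can review the commands before running anything.

```shell
#!/usr/bin/env bash
# Sketch: print the three scp commands for one node.
# Assumes the cfssl output names (ca.pem, NAME.pem, NAME-key.pem)
# and the root@NAME targets used in this article.
scp_plan() {
  local node="$1"
  echo "scp ca.pem root@${node}:etcd-ca.crt"
  echo "scp ${node}.pem root@${node}:server.crt"
  echo "scp ${node}-key.pem root@${node}:server.key"
}

for node in etcd1 etcd2 etcd3; do
  scp_plan "$node"
done
```

Once the printed commands look right, piping the output to sh from the certs directory would execute them.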

Log in to each node and run the following commands to move the certificates into the proper directory.

mkdir -p /etc/etcd
mv etcd-ca.crt server.crt server.key /etc/etcd
chmod 600 /etc/etcd/server.key

Certificates generated with OpenSSL are distributed as follows:

Copy the CA certificate ca.crt, the etcd certificate etcd.crt, and the key etcd.key to the /etc/etcd/ directory on each node.

Remember to update the certificate file names in the configuration file accordingly.

Certificate generation and distribution is now complete for every node. Next, we create a configuration file and a systemd unit file for each node.

Configure and start the etcd cluster

On node 1, create a file named etcd.conf in the /etc/etcd directory with the following content:

cat << EOF > /etc/etcd/etcd.conf
ETCD_NAME=etcd1
ETCD_LISTEN_PEER_URLS="https://172.19.184.7:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.19.184.7:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER="etcd1=https://172.19.184.7:2380,etcd2=https://172.19.184.8:2380,etcd3=https://172.19.184.9:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.19.184.7:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://172.19.184.7:2379"
ETCD_TRUSTED_CA_FILE="/etc/etcd/etcd-ca.crt"
ETCD_CERT_FILE="/etc/etcd/server.crt"
ETCD_KEY_FILE="/etc/etcd/server.key"
ETCD_PEER_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/etcd-ca.crt"
ETCD_PEER_KEY_FILE="/etc/etcd/server.key"
ETCD_PEER_CERT_FILE="/etc/etcd/server.crt"
ETCD_DATA_DIR="/data/etcd"
EOF

Node 2 configuration

cat << EOF > /etc/etcd/etcd.conf
ETCD_NAME=etcd2
ETCD_LISTEN_PEER_URLS="https://172.19.184.8:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.19.184.8:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER="etcd1=https://172.19.184.7:2380,etcd2=https://172.19.184.8:2380,etcd3=https://172.19.184.9:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.19.184.8:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://172.19.184.8:2379"
ETCD_TRUSTED_CA_FILE="/etc/etcd/etcd-ca.crt"
ETCD_CERT_FILE="/etc/etcd/server.crt"
ETCD_KEY_FILE="/etc/etcd/server.key"
ETCD_PEER_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/etcd-ca.crt"
ETCD_PEER_KEY_FILE="/etc/etcd/server.key"
ETCD_PEER_CERT_FILE="/etc/etcd/server.crt"
ETCD_DATA_DIR="/data/etcd"
EOF

Node 3 configuration

cat << EOF > /etc/etcd/etcd.conf
ETCD_NAME=etcd3
ETCD_LISTEN_PEER_URLS="https://172.19.184.9:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.19.184.9:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER="etcd1=https://172.19.184.7:2380,etcd2=https://172.19.184.8:2380,etcd3=https://172.19.184.9:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.19.184.9:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://172.19.184.9:2379"
ETCD_TRUSTED_CA_FILE="/etc/etcd/etcd-ca.crt"
ETCD_CERT_FILE="/etc/etcd/server.crt"
ETCD_KEY_FILE="/etc/etcd/server.key"
ETCD_PEER_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/etcd-ca.crt"
ETCD_PEER_KEY_FILE="/etc/etcd/server.key"
ETCD_PEER_CERT_FILE="/etc/etcd/server.crt"
ETCD_DATA_DIR="/data/etcd"
EOF
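
The three configuration files differ only in the member name and its IP address. As a sketch, the hypothetical function render_etcd_conf below prints the same settings for any member, given its name and IP. The INITIAL_CLUSTER variable is an assumption that simply repeats the member list used above.

```shell
#!/usr/bin/env bash
# Sketch: render etcd.conf for one member from its name and IP.
# INITIAL_CLUSTER repeats the member list used in this article.
INITIAL_CLUSTER="etcd1=https://172.19.184.7:2380,etcd2=https://172.19.184.8:2380,etcd3=https://172.19.184.9:2380"

render_etcd_conf() {
  local name="$1" ip="$2"
  cat <<EOF
ETCD_NAME=${name}
ETCD_LISTEN_PEER_URLS="https://${ip}:2380"
ETCD_LISTEN_CLIENT_URLS="https://${ip}:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER="${INITIAL_CLUSTER}"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://${ip}:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://${ip}:2379"
ETCD_TRUSTED_CA_FILE="/etc/etcd/etcd-ca.crt"
ETCD_CERT_FILE="/etc/etcd/server.crt"
ETCD_KEY_FILE="/etc/etcd/server.key"
ETCD_PEER_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/etcd-ca.crt"
ETCD_PEER_KEY_FILE="/etc/etcd/server.key"
ETCD_PEER_CERT_FILE="/etc/etcd/server.crt"
ETCD_DATA_DIR="/data/etcd"
EOF
}

# Example: print the file for node 2.
render_etcd_conf etcd2 172.19.184.8
```

On each node, redirecting the output to /etc/etcd/etcd.conf would produce the same file as the corresponding block above, and generating all three from one template avoids copy-and-paste slips in the URLs.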

Remember to substitute the private IP addresses of your own network.

With the configuration in place, we can create the systemd unit file on each node.

Create the file etcd.service under /lib/systemd/system with the following content:

cat << EOF > /lib/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
Type=notify
EnvironmentFile=/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd
Restart=always
RestartSec=10s
LimitNOFILE=40000

[Install]
WantedBy=multi-user.target
EOF

Because each node's settings live in the dedicated file /etc/etcd/etcd.conf, the unit file is identical on all nodes.

Start the service

We are now ready to start the service. Run the commands below on each node to start the etcd cluster:

systemctl daemon-reload
systemctl start etcd
# Make sure the etcd service has started and is running without errors.
systemctl status etcd
systemctl enable etcd

Startup errors

If the etcd service on etcd1 fails to start right after you launch it, inspect the etcd error messages with journalctl -xe, or follow the system log with tail -f /var/log/messages.

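Cause: etcd1 is trying to connect to etcd2 and etcd3, but the etcd service on those two nodes has not been started yet. Start etcd on etcd2 and etcd3 first, then start it on etcd1. A member cannot become ready until a majority of the cluster is reachable, so starting the service on all three nodes at about the same time also avoids the problem.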
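
Because a member only becomes ready once enough peers are up, a script that starts the nodes may need to wait and re-check rather than fail on the first attempt. The sketch below shows a generic retry helper for that purpose. The name retry and its argument order are assumptions made for this example.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY_SECONDS CMD...:
# run CMD until it succeeds or the attempts run out.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed; retrying in ${delay}s" >&2
    sleep "$delay"
  done
  return 1
}
```

For example, retry 30 2 etcdctl --endpoints https://172.19.184.7:2379 --cacert /etc/etcd/etcd-ca.crt --cert /etc/etcd/server.crt --key /etc/etcd/server.key endpoint health would poll the first member for up to about a minute.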

Test and verify the cluster

SSH into one of the nodes and connect to the cluster with the etcd CLI.

etcdctl --endpoints https://172.19.184.7:2379 --cert /etc/etcd/server.crt --cacert /etc/etcd/etcd-ca.crt --key /etc/etcd/server.key put foo bar

We have written a key to the etcd database. Let's see whether we can read it back.

etcdctl --endpoints https://172.19.184.7:2379 --cert /etc/etcd/server.crt --cacert /etc/etcd/etcd-ca.crt --key /etc/etcd/server.key get foo


Next, use the API endpoint to check the health of the cluster.

curl --cacert /etc/etcd/etcd-ca.crt --cert /etc/etcd/server.crt --key /etc/etcd/server.key https://172.19.184.7:2379/health
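
In a script, the health response can be turned into an exit status. The sketch below assumes that the endpoint returns a small JSON body containing "health":"true" when the member is healthy. The sample bodies in the code are assumptions about that shape, not captured output, and is_healthy is a hypothetical helper.

```shell
#!/usr/bin/env bash
# is_healthy: read a /health response body on stdin and
# succeed only if it reports health "true".
# The sample bodies below are assumed shapes, not captured output.
is_healthy() {
  grep -Eq '"health"[[:space:]]*:[[:space:]]*"true"'
}

if echo '{"health":"true"}' | is_healthy; then
  echo "healthy"
fi
if ! echo '{"health":"false"}' | is_healthy; then
  echo "unhealthy"
fi
```

Piping the output of the curl command above (with -s added to silence progress output) into is_healthy gives a check that can be used in an if statement or a monitoring script.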


Finally, make sure that all nodes have joined the cluster.

etcdctl --endpoints https://172.19.184.7:2379 --cert /etc/etcd/server.crt --cacert /etc/etcd/etcd-ca.crt --key /etc/etcd/server.key member list
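
The etcdctl commands above repeat the same four flags. etcdctl also reads them from ETCDCTL_-prefixed environment variables, so one option is to export them once per shell session. In the sketch below, build_endpoints is a hypothetical helper, and the IPs and certificate paths are the ones used in this article.

```shell
#!/usr/bin/env bash
# Sketch: set etcdctl's environment variables once instead of repeating flags.
# build_endpoints joins the given IPs into a comma-separated client URL list.
build_endpoints() {
  local out="" ip
  for ip in "$@"; do
    out="${out:+${out},}https://${ip}:2379"
  done
  echo "$out"
}

export ETCDCTL_ENDPOINTS="$(build_endpoints 172.19.184.7 172.19.184.8 172.19.184.9)"
export ETCDCTL_CACERT=/etc/etcd/etcd-ca.crt
export ETCDCTL_CERT=/etc/etcd/server.crt
export ETCDCTL_KEY=/etc/etcd/server.key

echo "$ETCDCTL_ENDPOINTS"
```

With these exported, etcdctl member list and etcdctl endpoint status -w table can be run without any connection flags and will query all three members rather than only the first.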


Congratulations! You now have a secure, distributed, and highly available etcd cluster.

Note that etcd requires --advertise-client-urls to be set whenever --listen-client-urls is set, so it must be specified explicitly even when its value matches the default.
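
This requirement can be checked before the service is started. The sketch below assumes the environment-file format used in this article (one ETCD_ variable per line). The helper name check_client_urls is hypothetical.

```shell
#!/usr/bin/env bash
# check_client_urls FILE: fail if ETCD_LISTEN_CLIENT_URLS is set
# in FILE without ETCD_ADVERTISE_CLIENT_URLS.
check_client_urls() {
  local conf="$1"
  if grep -q '^ETCD_LISTEN_CLIENT_URLS=' "$conf" &&
     ! grep -q '^ETCD_ADVERTISE_CLIENT_URLS=' "$conf"; then
    echo "$conf: ETCD_LISTEN_CLIENT_URLS is set but ETCD_ADVERTISE_CLIENT_URLS is missing" >&2
    return 1
  fi
  return 0
}
```

Running check_client_urls /etc/etcd/etcd.conf on each node returns a non-zero status when the advertise setting is missing.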
