
DoH Service

~$ apt-get install dnscrypt-proxy -y
  • Open the file /etc/dnscrypt-proxy/dnscrypt-proxy.toml in your favorite editor. Find the general section and set the server_names variable.
server_names = ['cloudflare']
  • Then change the nameserver in /etc/resolv.conf to point at the local proxy.
    ~$ cat /etc/resolv.conf
    # Generated by dhcpcd
    nameserver 127.0.0.1
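Putting the two edits together, the relevant dnscrypt-proxy settings look like this (a sketch; server_names is the value set above, while listen_addresses is shown with its common default — on Debian the package may instead rely on systemd socket activation and leave it empty):

```toml
# /etc/dnscrypt-proxy/dnscrypt-proxy.toml (excerpt)
listen_addresses = ['127.0.0.1:53']   # where the proxy answers plain DNS queries
server_names = ['cloudflare']         # upstream DoH resolver
```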

NextCloud

Setting up nextcloud + collabora/code with docker-compose (no HTTPS)

nextcloud$ cat docker-compose.yml
version: "3"
services:
  pg11:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_PASSWORD: ${DB_PASS}
      POSTGRES_USER: ${DB_USER}
    logging:
      driver: "none"
    restart: unless-stopped
    volumes:
      - ./db-data:/var/lib/postgresql/data
    env_file:
      - ./.env

  collabora:
    image: collabora/code:latest
    restart: always
    environment:
      - username=${ADM_USER}
      - password=${ADM_PASS}
      # Note: the parameter line below is important.
      - "extra_params=--o:ssl.enable=false --o:net.post_allow.host=192.168.1.100 --o:storage.wopi.host=192.168.1.100 --o:ssl.termination=false"
    ports:
      - "9080:9980"
    cap_add:
      - MKNOD
    env_file:
      - ./.env
    volumes:
      - /etc/hosts:/etc/hosts:ro

  nextcloud:
    image: nextcloud
    restart: unless-stopped
    depends_on:
      - pg11
    cap_add:
      - MKNOD
      - ALL
    volumes:
      - ./volumes/html:/var/www/html
      - ./volumes/config:/var/www/html/config
      - ./volumes/custom_apps:/var/www/html/custom_apps
      - ./volumes/data:/var/www/html/data
      - ./volumes/theme:/var/www/html/themes
      - /etc/localtime:/etc/localtime:ro
      - /etc/hosts:/etc/hosts:ro
    ports:
      - 8080:80

  • If nextcloud is installed without HTTPS, the page http://cloud.domain.im/settings/apps cannot show the list of installable apps. You can instead download app packages manually from https://apps.nextcloud.com, extract them into nextcloud/html/apps, and enable them there; they then work just as if they had been installed online. Also, as shown above, add a record such as 192.168.1.100 cloud.domain.im to the local /etc/hosts and mount it into the containers, so that a domain name can be used for testing and for generating the security certificate.
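The /etc/hosts record in question is a single line (192.168.1.100 and cloud.domain.im are the example values used throughout this section):

```
# /etc/hosts -- local test record, mounted read-only into the containers
192.168.1.100   cloud.domain.im
```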

  • Installing the Collabora Online server integration

    ~$ wget -c https://github.com/nextcloud-releases/richdocuments/releases/download/v5.0.3/richdocuments-v5.0.3.tar.gz
    ~$ wget -c https://github.com/CollaboraOnline/richdocumentscode/releases/download/21.11.306/richdocumentscode.tar.gz
    ~$ cd volumes/html/apps
    ~$ for f in ~/richdocuments*.tar.gz; do tar xvf "$f"; done && sudo chown -R www-data:www-data richdocuments richdocumentscode
  • Configure the Collabora Online server URL.

    • Open http://cloud.domain.im:8080/settings/admin/richdocuments, select Use your own server, enter http://cloud.domain.im:9080, and press Save. If the status shows a green check mark and the message Collabora Online server is reachable., the server connection works.
  • Now go into the files area; if files such as docx and odt can be opened, Collabora Online is installed successfully.

  • Opening http://cloud.domain.im:9080/browser/dist/admin/admin.html shows the server's management and monitoring information; files currently open for editing in Collabora Online also appear in the Documents open list.

Self-signed HTTPS version

nextcloud$ tree -L 2
.
├── collabora
│   └── coolwsd.xml
├── create_self-sign-tls.sh
├── db-data [error opening dir]
├── docker-compose.yml
├── nginx
│   ├── certs
│   ├── nginx.conf
│   ├── sites-enabled
│   └── ssl
└── volumes
    ├── config
    ├── custom_apps
    ├── data
    ├── html
    └── theme

12 directories, 4 files

  • The nginx mount directory layout is as follows.
nextcloud$ tree nginx/
nginx/
├── certs
│   ├── cloud.domain.im.crt
│   └── cloud.domain.im.key
├── nginx.conf
├── sites-enabled
│   └── nextcloud.conf
└── ssl
    └── dhparam.pem

3 directories, 5 files

  • The nginx global main configuration file.

    nextcloud$ cat nginx/nginx.conf
    user nginx;
    worker_processes 1;
    error_log /var/log/nginx/error.log warn;
    pid /var/run/nginx.pid;

    events {
        worker_connections 2048;
        multi_accept on;
    }

    error_log syslog:server=unix:/dev/log,facility=local6,tag=nginx,severity=error;

    http {
        log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for"';
        access_log /var/log/nginx/access.log main;
        index index.html index.htm;
        charset utf-8;
        server_tokens off;
        autoindex off;
        client_max_body_size 512m;
        include mime.types;
        default_type application/octet-stream;
        sendfile on;
        sendfile_max_chunk 51200k;
        tcp_nopush on;
        tcp_nodelay on;
        open_file_cache max=1000 inactive=20s;
        open_file_cache_valid 30s;
        open_file_cache_min_uses 2;
        open_file_cache_errors off;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_session_tickets off;
        ssl_session_cache shared:SSL:50m;
        ssl_session_timeout 10m;
        ssl_stapling off;
        ssl_stapling_verify off;
        resolver 8.8.8.8 8.8.4.4; # replace with `127.0.0.1` if you have a local dns server
        ssl_prefer_server_ciphers on;
        ssl_dhparam ssl/dhparam.pem; # openssl dhparam -out ssl/dhparam.pem 4096
        gzip on;
        gzip_disable msie6;
        gzip_vary on;
        gzip_proxied any;
        gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
        include conf.d/*.conf;
        include sites-enabled/*.conf;
    }

  • The nextcloud site proxy configuration file.

nextcloud$ cat nginx/sites-enabled/nextcloud.conf
server {
    listen 80;
    listen 443 ssl http2;
    listen [::]:443 ssl http2;

    server_name cloud.domain.im;

    client_max_body_size 0;
    underscores_in_headers on;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        add_header Front-End-Https on;

        proxy_headers_hash_max_size 512;
        proxy_headers_hash_bucket_size 64;

        proxy_buffering off;
        proxy_redirect off;
        proxy_max_temp_file_size 0;
        proxy_pass http://cloud.domain.im:8880;
    }

    # static files
    location ^~ /browser {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Host $http_host;
    }

    # WOPI discovery URL
    location ^~ /hosting/discovery {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Host $http_host;
    }

    # Capabilities
    location ^~ /hosting/capabilities {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Host $http_host;
    }

    # main websocket
    location ~ ^/cool/(.*)/ws$ {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_set_header Host $http_host;
        proxy_read_timeout 36000s;
    }

    # download, presentation and image upload
    location ~ ^/(c|l)ool {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Host $http_host;
    }

    # Admin Console websocket
    location ^~ /cool/adminws {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_set_header Host $http_host;
        proxy_read_timeout 36000s;
    }

    ssl_certificate /etc/nginx/certs/cloud.domain.im.crt;
    ssl_certificate_key /etc/nginx/certs/cloud.domain.im.key;
}

  • A one-shot openssl script creates the self-signed certificates.
nextcloud$ cat create_self-sign-tls.sh
#!/bin/bash
DOMAIN=$1
[[ ! -e nginx/ssl ]] && mkdir -pv nginx/ssl
DHP=./nginx/ssl/dhparam.pem
openssl dhparam -out $DHP 2048
[[ ! -e nginx/certs ]] && mkdir -pv nginx/certs
KEY=./nginx/certs/${DOMAIN}.key
CRT=./nginx/certs/${DOMAIN}.crt
openssl req -new -newkey rsa:4096 -days 3650 -nodes -x509 -subj "/C=US/ST=NC/L=Local/O=Dev/CN=${DOMAIN}" -keyout $KEY -out $CRT
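Before wiring the generated files into nginx, the certificate can be sanity-checked with openssl x509. The sketch below generates a throwaway certificate for a made-up test.local name (so it does not depend on the script above) and prints its subject and expiry; the same -noout -subject -enddate flags work on the real nginx/certs/*.crt files:

```shell
# Create a scratch directory and a short-lived self-signed certificate.
WORK=$(mktemp -d)
openssl req -new -newkey rsa:2048 -days 1 -nodes -x509 \
    -subj "/C=US/ST=NC/L=Local/O=Dev/CN=test.local" \
    -keyout "$WORK/test.local.key" -out "$WORK/test.local.crt" 2>/dev/null

# Print the subject and the notAfter date to confirm CN and validity.
openssl x509 -noout -subject -enddate -in "$WORK/test.local.crt"
```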

  • One-shot script to create self-signed TLS certificates (for frp).
#!/bin/bash

#
# If you use frp via an IP address rather than a hostname, make sure to set the
# appropriate IP address in the Subject Alternative Name (SAN) field when
# generating the SSL/TLS certificates.

DAYS=5000
export DOMAIN=$1
if [ -z "$DOMAIN" ]; then
    echo "no domain name given, using default: example.com"
    DOMAIN="example.com"
fi
rm -rf $DOMAIN
[ ! -d $DOMAIN ] && mkdir $DOMAIN && cd $DOMAIN

cat > my-openssl.cnf << EOF
[ ca ]
default_ca = CA_default
[ CA_default ]
x509_extensions = usr_cert
[ req ]
default_bits = 2048
default_md = sha256
default_keyfile = privkey.pem
distinguished_name = req_distinguished_name
attributes = req_attributes
x509_extensions = v3_ca
string_mask = utf8only
prompt = no
[ req_distinguished_name ]
C = US
ST = VA
L = SomeCity
O = MyCompany
OU = MyDivision
CN = ${DOMAIN}
[ req_attributes ]
[ usr_cert ]
basicConstraints = CA:FALSE
nsComment = "OpenSSL Generated Certificate"
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer
[ v3_ca ]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = CA:true
EOF


# build ca certificates

echo "---> build ca certificates"
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=${DOMAIN}" -days ${DAYS} -out ca.crt


# build frps server side certificates

mkdir server
echo "---> build frps server side certificates"
openssl genrsa -out server/server.key 2048

openssl req -new -sha256 -key server/server.key -out server/server.csr -config my-openssl.cnf

openssl x509 -req -days ${DAYS} \
    -in server/server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -extfile <(printf "subjectAltName=IP:127.0.0.1") \
    -out server/server.crt

# build frpc client side certificates

echo "---> build frpc client side certificates"
mkdir client
openssl genrsa -out client/client.key 2048
openssl req -new -sha256 -key client/client.key -out client/client.csr -config my-openssl.cnf

openssl x509 -req -days ${DAYS} \
    -in client/client.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -extfile <(printf "subjectAltName=IP:127.0.0.1") \
    -out client/client.crt

cp ca.crt server/ca.crt
mv ca.crt client/ca.crt
rm ca.key ca.srl client/client.csr server/server.csr

echo "create Certificates done!!!!"
echo "verify the server Certificates"

cd server
openssl verify -CAfile ca.crt server.crt
cd ../client
openssl verify -CAfile ca.crt client.crt
cd ..
chmod 644 client/client.key server/server.key

  • The final docker-compose file.
nextcloud$ cat docker-compose.yml
version: "3"
services:
  nginx:
    image: nginx:latest
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
      - ./nginx/ssl/dhparam.pem:/etc/nginx/ssl/dhparam.pem
      - ./nginx/certs:/etc/nginx/certs
      - ./nginx/sites-enabled:/etc/nginx/sites-enabled
      - /etc/hosts:/etc/hosts:ro
    ports:
      - 80:80
      - 443:443
    environment:
      - SITE_DOMAIN=cloud.domain.im
    depends_on:
      - nextcloud

  pg11:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: db3
      POSTGRES_PASSWORD: rocks
      POSTGRES_USER: cloudjs
    logging:
      driver: "none"
    restart: unless-stopped
    volumes:
      - ./db-data:/var/lib/postgresql/data

  collabora:
    image: collabora/code:latest
    restart: always
    environment:
      - username=admin
      - password=admin
      - cert_domain=cloud.domain.im
      - "extra_params=--o:ssl.enable=true --o:net.post_allow.host=192.168.1.100 --o:storage.wopi.host=192.168.1.100 --o:ssl.termination=true"
    ports:
      - "9980:9980"
    cap_add:
      - MKNOD
    volumes:
      - /etc/hosts:/etc/hosts:ro
      - ./collabora/coolwsd.xml:/etc/coolwsd/coolwsd.xml

  nextcloud:
    image: nextcloud
    restart: unless-stopped
    depends_on:
      - pg11
    cap_add:
      - MKNOD
    volumes:
      - ./volumes/html:/var/www/html
      - ./volumes/config:/var/www/html/config
      - ./volumes/custom_apps:/var/www/html/custom_apps
      - ./volumes/data:/var/www/html/data
      - ./volumes/theme:/var/www/html/themes
      - /etc/localtime:/etc/localtime:ro
      - /etc/hosts:/etc/hosts:ro
    ports:
      - 8880:80

  • As the file above shows, the default configuration file inside the collabora/code container must also be modified: first copy the original file out of a running container, edit it, and then mount it back to override the one inside the container.
nextcloud$ docker cp <container ID>:/etc/coolwsd/coolwsd.xml ./collabora/coolwsd.xml
nextcloud$ chmod 755 ./collabora/coolwsd.xml
  • Open coolwsd.xml, find the server_name node, and set its value to cloud.domain.im, as follows:
nextcloud$ cat collabora/coolwsd.xml
[...]
<server_name desc="External hostname:port of the server running coolwsd. If empty, it's derived from the request (please set it if this doesn't work). May be specified when behind a reverse-proxy or when the hostname is not reachable directly." type="string" default="">cloud.domain.im</server_name>
[...]
  • Testing showed that without the coolwsd.xml change above, Collabora Online could not open files for editing, and the log reported the following error.

    wsd-00001-00039 2022-04-22 15:52:33.479492 +0000 [ websrv_poll ] ERR  #29 Error while handling poll at 0 in websrv_poll: #29BIO error: 0, rc: -1: error:00000000:lib(0):func(0):reason(0):| net/Socket.cpp:467
  • Because nginx is used as the HTTPS proxy, nextcloud/config/config.php also needs to be modified: add 'overwriteprotocol' => 'https', so that redirects use HTTPS.

nextcloud$ sudo cat volumes/config/config.php
[...]
  'overwriteprotocol' => 'https',
  'datadirectory' => '/var/www/html/data',
  'dbtype' => 'pgsql',
  'version' => '23.0.3.2',
  'overwrite.cli.url' => 'http://cloud.domain.im',
  'dbname' => 'db3',
  'dbhost' => 'pg11',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'dbuser' => 'oc_lcy',
  'appstoreenabled' => true,
  'appstoreurl' => 'https://apps.nextcloud.com/api/v1',
  'installed' => true,
);

Docker installation

  • nextcloud/docker
  • This instance is for personal use, so there are no special database requirements and sqlite is enough, but HTTPS is still wanted for security. Install it with docker.
~$ docker images | grep "nextcloud" || docker pull nextcloud
~$ if [ ! -d nextcloud ]; then
mkdir -pv nextcloud/{nextcloud,config,custom_apps,data,theme}
fi
~$ cd nextcloud && docker run -d --restart=always --name nextcloud -p 8443:443 \
-v nextcloud:/var/www/html \
-v config:/var/www/html/config \
-v custom_apps:/var/www/html/custom_apps \
-v data:/var/www/html/data \
-v theme:/var/www/html/themes \
-v /etc/localtime:/etc/localtime:ro \
nextcloud
  • Below, nextcloud runs in docker with the sqlite database and is exposed externally through a cloudflare tunnel; no nginx-style proxy is configured.
<?php
$CONFIG = array (
  'htaccess.RewriteBase' => '/',
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'apps_paths' =>
  array (
    0 =>
    array (
      'path' => '/var/www/html/apps',
      'url' => '/apps',
      'writable' => false,
    ),
    1 =>
    array (
      'path' => '/var/www/html/custom_apps',
      'url' => '/custom_apps',
      'writable' => true,
    ),
  ),
  'instanceid' => 'xxxxxx',
  'passwordsalt' => 'xxxxxx',
  'secret' => 'xxxxxx',
  'trusted_domains' =>
  array (
    0 => '192.168.1.182',
    1 => 'mycloud.example.com',
  ),
  'datadirectory' => '/var/www/html/data',
  'dbtype' => 'sqlite3',
  'version' => '26.0.2.1',
  'overwrite.cli.url' => 'https://mycloud.example.com/',
  'overwritehost' => '',
  'overwriteprotocol' => 'https',
  'installed' => true,
);

  • The parameters that need to be configured:
    • trusted_domains: include the tunnel hostname.
    • overwrite.cli.url: https://mycloud.example.com/
    • overwritehost: ''
    • overwriteprotocol: 'https', because the tunnel serves external access over https.

Dokku installation

Installing plugins

~$ sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git
~$ sudo dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres

Installation

~$ dokku apps:create mycloud
~$ docker pull nextcloud
~$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
nextcloud    latest   7aa569922593   11 hours ago   835MB

# Note: The image must be retagged `dokku/<app-name>:<version>`
~$ docker tag nextcloud:latest dokku/mycloud:latest

~$ sudo mkdir -p /var/lib/dokku/data/storage/mycloud
~$ sudo chown -R dokku:dokku /var/lib/dokku/data/storage/mycloud
~$ dokku storage:mount mycloud /var/lib/dokku/data/storage/mycloud:/var/www/html
~$ dokku tags:deploy mycloud latest
  • Link the database. After running the commands below, open a browser and go through the first-run configuration of the admin account and database type (Storage & database); the default is SQLite.

    ~$ dokku postgres:create mycloud_db
    ~$ dokku postgres:link mycloud_db mycloud
  • Set a domain name. The domain is mainly needed for Let's Encrypt and for SNI. No registered domain is used here; a dynamic domain from dynv6.net is used instead.

    ~$ dokku domains:add mycloud  nc.llccyy.dynv6.net
    ~$ dokku domains:remove mycloud mycloud.localhost
    ~$ dokku config:set mycloud --no-restart DOKKU_LETSENCRYPT_EMAIL=yjdwbj@gmail.com
    ~$ dokku letsencrypt mycloud
  • Certificate retrieval error:

darkhttpd/1.12, copyright (c) 2003-2016 Emil Mikulic.
listening on: http://0.0.0.0:80/
2021-02-28 08:15:08,915:INFO:__main__:1406: Generating new certificate private key
2021-02-28 08:15:32,772:ERROR:__main__:1388: CA marked some of the authorizations as invalid, which likely means it could not access http://example.com/.well-known/acme-challenge/X. Did you set correct path in -d example.com:path or --default_root? Are all your domains accessible from the internet? Please check your domains' DNS entries, your host's network/firewall setup and your webserver config. If a domain's DNS entry has both A and AAAA fields set up, some CAs such as Let's Encrypt will perform the challenge validation over IPv6. If your DNS provider does not answer correctly to CAA records request, Let's Encrypt won't issue a certificate for your domain (see https://letsencrypt.org/docs/caa/). Failing authorizations: https://acme-v02.api.letsencrypt.org/acme/authz-v3/11204345532
Challenge validation has failed, see error log.

Debugging tips: -v improves output verbosity. Help is available under --help.
-----> Certificate retrieval failed!
-----> Disabling ACME proxy for fpm...
done
  • The error above is basically a DNS A-record problem: the domain cannot be resolved. It can also be an issue with the domain provider; this happened with dynv6.com, and sometimes retrying later without any change succeeded. Trying another DDNS provider is also an option.
  • A few requirements to note when obtaining Let's Encrypt certificates in dokku:
    • You must have a domain name (with an A or AAAA record); a second-level dynamic domain also works.
    • DOKKU_DOCKERFILE_PORTS must be 80/tcp. For example, nextcloud:latest exposes 80/tcp, while nextcloud:fpm-alpine exposes 9000/tcp and is only a php-fpm instance without a web server; so an app created from nextcloud:latest can obtain a certificate correctly.
    • If DOKKU_DOCKERFILE_PORTS is not port 80, first set up the proxy with dokku proxy:ports-set <APP> http:80:9000, then request the certificate.
  • If the app was created from a Docker image, i.e. deployed with dokku tags:deploy <appname> latest, the ports it exposes can be inspected with the following docker command.
~$ docker inspect <appname>.web.1 | jq ".[0].Config.ExposedPorts"
  • Or inspect the image configuration directly:
    ~$ docker image inspect <image tag> | jq '.[0].Config.ExposedPorts'
  • Setting the upload file size limit
  • nginx configuration
~$ dokku nginx:set mycloud  client-max-body-size 100m
~$ dokku nginx:show-config mycloud

Audio & Video

Jellyfin media center

~$ docker pull jellyfin/jellyfin
~$ docker run -d -p 8096:8096 --name jellyfin -v `pwd`/jellyfin/config:/config -v `pwd`/jellyfin/cache:/cache -v `pwd`/Incoming:/media --restart=unless-stopped docker.io/jellyfin/jellyfin:latest

KODI (XBMC)

  • kodi

  • Kodi supports a very wide range of platforms. On a Xiaomi Mi TV 2, which runs a deeply customized Android 4.3, the newest version that can be installed is kodi-16.1-Jarvis-armeabi-v7a.apk. In that version many of the built-in add-ons no longer work, and many third-party plugins fail too; for example, the jellyfin-kodi plugin is unusable. Still, pairing it with minidlna works well, at least a better experience than SMBFS.

Local bookmarks service

PDF-to-text OCR

poppler-utils

  • The example below uses a Japanese manual: pdftotext cannot convert it (the output is garbled), and pdftohtml reports a copyright-permission problem.
~$ sudo apt-get install poppler-utils
~$ pdftotext ~/Downloads/SDFA.pdf target.txt
~$ pdftohtml ~/Downloads/SDFA.pdf target.html
Permission Error: Copying of text from this document is not allowed.

ocrmypdf

  • ocrmypdf cannot convert an encrypted PDF either.
~$ sudo apt-get install ocrmypdf
~$ ocrmypdf ~/Downloads/SDFA.pdf -l jpn test.txt
EncryptedPdfError: Input PDF is encrypted. The encryption must be removed to
perform OCR.

For information about this PDF's security use
qpdf --show-encryption infilename

You can remove the encryption using
qpdf --decrypt [--password=[password]] infilename

Tesseract

~$ dpkg -l | grep "tesseract"
ii libtesseract-dev:amd64 4.1.1-2.1 amd64 Development files for the tesseract command line OCR tool
ii libtesseract4:amd64 4.1.1-2.1 amd64 Tesseract OCR library
ii tesseract-ocr 4.1.1-2.1 amd64 Tesseract command line OCR tool
ii tesseract-ocr-chi-sim 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for Chinese - Simplified
ii tesseract-ocr-cym 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for Welsh
ii tesseract-ocr-dev 3.04.01-5 all transitional dummy package
ii tesseract-ocr-eng 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for English
ii tesseract-ocr-equ 3.04.00-1 all tesseract-ocr language files for equations
ii tesseract-ocr-jpn 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for Japanese
ii tesseract-ocr-osd 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for script and orientation
  • Tesseract does not support reading PDF files directly.

    ~$ tesseract ~/Downloads/SDFA.pdf ttt --dpi 150
    Tesseract Open Source OCR Engine v4.1.1 with Leptonica
    Error in pixReadStream: Pdf reading is not supported
    Error in pixRead: pix not read
    Error during processing.

  • First convert it into individual png pages:

~$ pdftoppm -png ~/Downloads/SDFA.pdf turing
~$ ls turing-*.png
turing-1.png turing-2.png turing-3.png turing-4.png turing-5.png
  • Install the target language pack, then run tesseract on a page.

    ~$ tesseract  --list-langs
    List of available languages (5):
    chi_sim
    cym
    eng
    jpn
    osd

    ~$ tesseract turing-2.png turing -l jpn --dpi 150
  • This writes the recognized text to turing.txt.
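To OCR every page instead of just one, the tesseract call can be wrapped in a loop (a sketch assuming the turing- page prefix and the jpn language pack used above):

```shell
#!/usr/bin/env bash
# OCR each rendered page: ${f%.png} strips the extension, so the text for
# turing-1.png is written to turing-1.txt, and so on.
shopt -s nullglob            # the glob expands to nothing when no pages match
for f in turing-*.png; do
    tesseract "$f" "${f%.png}" -l jpn --dpi 150
done
```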

Frog

Offline wiki

Kiwix

~$ sudo apt-get install libxapian-dev libpugixml-dev
~$ git clone https://github.com/kiwix/libkiwix
~$ cd libkiwix
~$ cat >build.patch<<EOF
index fca77ec..5111206 100644
--- a/meson.build
+++ b/meson.build
@@ -28,9 +28,9 @@ zlib_dep = dependency('zlib', static:static_deps)
 xapian_dep = dependency('xapian-core', static:static_deps)
 
 if compiler.has_header('mustache.hpp')
-  extra_include = []
-elif compiler.has_header('mustache.hpp', args: '-I/usr/include/kainjow')
-  extra_include = ['/usr/include/kainjow']
+  extra_include = ['/home/michael/3TB-DISK/github/kiwix/Mustache']
+elif compiler.has_header('mustache.hpp', args: '-I/home/michael/3TB-DISK/github/kiwix/Mustache')
+  extra_include = ['/home/michael/3TB-DISK/github/kiwix/Mustache']
 else
   error('Cannot found header mustache.hpp')
 endif
EOF
~$ meson . build && ninja -C build install
~$ git clone https://github.com/openzim/libzim
~$ cd libzim
~$ meson . build && ninja -C build install
~$ git clone https://github.com/kiwix/kiwix-desktop
~$ cd kiwix-desktop
~$ qmake && make -j10 install
  • Building a deb package directly
~$ cat > rules.patch <<'EOF'
index f023663..f4f2877 100755
--- a/debian/rules
+++ b/debian/rules
@@ -4,3 +4,7 @@ export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 
 %:
 	dh $@
+
+override_dh_shlibdeps:
+	dh_shlibdeps --dpkg-shlibdeps-params=--ignore-missing-info
EOF
~ kiwix-desktop$ git apply rules.patch
~ kiwix-desktop$ dpkg-buildpackage -us -uc

Setting up a Matrix server

~$ mkdir matrix
~$ docker network create --driver=bridge --subnet=10.10.10.0/24 --gateway=10.10.10.1 matrix_net
~$ cd matrix
  • If this is only for testing, or VPS resources are limited, sqlite3 is enough; comment out the postgres service.

  • docker-compose.yaml

version: '3.8'
services:
  postgres:
    image: postgres:11-alpine
    restart: unless-stopped
    networks:
      default:
        ipv4_address: 10.10.10.2
    volumes:
      - ./postgresdata:/var/lib/postgresql/data

    # These will be used in homeserver.yaml later on
    environment:
      - POSTGRES_DB=synapse
      - POSTGRES_USER=synapse
      - POSTGRES_PASSWORD=STRONGPASSWORD
      - POSTGRES_INITDB_ARGS=--lc-collate C --lc-ctype C --encoding UTF8

  element:
    image: vectorim/element-web:latest
    restart: unless-stopped
    volumes:
      - ./element-config.json:/app/config.json
    networks:
      default:
        ipv4_address: 10.10.10.3

  synapse:
    image: matrixdotorg/synapse:latest
    restart: unless-stopped
    networks:
      default:
        ipv4_address: 10.10.10.4
    volumes:
      - ./synapse:/data

networks:
  default:
    external:
      name: matrix_net

  • Download element's template configuration file and delete the line "default_server_name": "matrix.org".
matrix$ wget https://develop.element.io/config.json
matrix$ mv config.json element-config.json
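If jq is available (an assumption), deleting the default_server_name key can be done non-interactively instead of by hand:

```shell
# Write a copy without the default_server_name key; the guard keeps this a
# no-op when config.json has not been downloaded yet.
if [ -f config.json ]; then
    jq 'del(.default_server_name)' config.json > element-config.json
fi
```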

Generating the Synapse configuration file

matrix$ mkdir synapse
matrix$ docker run -it --rm \
-v "$(pwd)/synapse:/data" \
-e SYNAPSE_SERVER_NAME=matrix.example.com \
-e SYNAPSE_REPORT_STATS=yes \
matrixdotorg/synapse:latest generate
  • Synapse uses sqlite3 by default. To use postgres instead, set the database section of synapse/homeserver.yaml as follows:
database:
  name: psycopg2
  args:
    user: synapse
    password: STRONGPASSWORD
    database: synapse
    host: postgres
    cp_min: 5
    cp_max: 10
  • Create a new user
matrix$ docker-compose up -d

matrix$ docker exec -it matrix_synapse_1 bash
~# register_new_matrix_user -c /data/homeserver.yaml http://localhost:8008
New user localpart [root]: ruan
Password:
Confirm password:
Make admin [no]: yes
Sending registration request...
Success!

Caddy v2 proxy

$ apt install -y debian-keyring debian-archive-keyring apt-transport-https
$ curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo tee /etc/apt/trusted.gpg.d/caddy-stable.asc
$ curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
$ apt update
$ apt install caddy -y
matrix$ cat Caddyfile
http://matrix.example.com {
    tls internal
    reverse_proxy /_matrix/* 10.10.10.4:8008
    reverse_proxy /_synapse/client/* 10.10.10.4:8008
    log {
        output file /var/log/caddy/matrix.example.log
    }
    header {
        X-Content-Type-Options nosniff
        Referrer-Policy strict-origin-when-cross-origin
        Strict-Transport-Security "max-age=63072000; includeSubDomains;"
        Permissions-Policy "accelerometer=(), camera=(), geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=(), interest-cohort=()"
        X-Frame-Options SAMEORIGIN
        X-XSS-Protection 1
        X-Robots-Tag none
        -server
    }
}

http://element.example.com {
    tls internal
    encode zstd gzip
    reverse_proxy 10.10.10.3:80

    log {
        output file /var/log/caddy/element.example.log
    }
    header {
        X-Content-Type-Options nosniff
        Referrer-Policy strict-origin-when-cross-origin
        Strict-Transport-Security "max-age=63072000; includeSubDomains;"
        Permissions-Policy "accelerometer=(), camera=(), geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=(), interest-cohort=()"
        X-Frame-Options SAMEORIGIN
        X-XSS-Protection 1
        X-Robots-Tag none
        -server
    }
}

  • Load the configuration into the running system
matrix$ caddy adapt --config /path/path/Caddyfile
matrix$ caddy fmt
matrix$ caddy reload

  • Note: if a site in the Caddyfile is written simply as <domain.com> {}, Caddy automatically redirects http -> https, which for intranet or local browser testing produces errors such as Mixed content blocked.
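The two forms side by side (placeholder domain; the reverse_proxy target mirrors the element block of this section):

```
# Bare address: Caddy provisions a certificate and redirects http -> https.
site.example.com {
    reverse_proxy 10.10.10.3:80
}

# Explicit http:// prefix: plain HTTP only, no redirect (the form used above).
http://site.example.com {
    reverse_proxy 10.10.10.3:80
}
```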

Connecting the mobile app

  • Open the web admin page https://<domain>/settings/user/security. Under Devices & sessions, click Create new app password to generate a one-time username and password, then click Show QR code for mobile apps; in the mobile nextcloud app, choose to connect to a self-hosted server and scan the code.
  • If you run into problems, check the logs at https://<domain>/settings/admin/logging.

MinIO

Running in a container

~$ sudo apt-get install podman -y
~$ podman run \
-p 9000:9000 \
-p 9001:9001 \
minio/minio server /data --console-address ":9001"
  • If the following error appears, make sure /etc/containers/registries.conf contains the line unqualified-search-registries=["docker.io"], then retry.

    Error: error getting default registries to try: short-name "minio/minio" did not resolve to an alias and no unqualified-search registries are defined in "/etc/containers/registries.conf"
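The fix is a single line in the registries configuration:

```toml
# /etc/containers/registries.conf (excerpt)
unqualified-search-registries = ["docker.io"]
```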
  • It can also be run with docker

    ~$ docker run -it --rm -p 9000:9000 \
    -v `pwd`/minio-data:/data \
    -e MINIO_ROOT_USER=minio \
    -e MINIO_ROOT_PASSWORD=minio123 \
    -p 9001:9001 minio/minio server /data --console-address ":9001"

Running the standalone binary directly

~$ wget http://dl.minio.org.cn/server/minio/release/darwin-amd64/minio
~$ chmod +x minio
~$ ./minio server /data

Client access

mc

~$ ./mc alias set myminio http://127.0.0.1:9000 minio minio123
Added `myminio` successfully.

  • Add a user

    ~$ ./mc admin user add myminio testuser testpwd123
    Added user `testuser` successfully.
    ~$ ./mc admin user info myminio testuser
    AccessKey: testuser
    Status: enabled
    PolicyName:
    MemberOf:
  • Create a bucket

    ./mc mb myminio/test-new-s3-bucket
    Bucket created successfully `myminio/test-new-s3-bucket`.
  • Create a bucket policy

~$ cat test-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:ListAllMyBuckets"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::<change-this-to-your-bucket-name>/*"
            ],
            "Sid": "Public"
        }
    ]
}
  • Add the policy file to the server

    ~$ ./mc admin policy add myminio test-policy test-policy.json
    Added policy `test-policy` successfully.

  • Apply the policy to the given user

    ~$ ./mc admin policy set myminio "test-policy" user=testuser
    Policy `test-policy` is set on user `testuser`

    ~$ ./mc admin user info myminio testuser
    AccessKey: testuser
    Status: enabled
    PolicyName: test-policy
    MemberOf:

awscli

~$ pip3 install awscli
~$ aws configure --profile minio
AWS Access Key ID [None]: minio
AWS Secret Access Key [None]: minio123
Default region name [None]: myminio
Default output format [None]: json

  • Create a bucket.

    ~$ aws --profile minio --endpoint-url http://127.0.0.1:9000 s3 mb s3://new-s3-bucket
  • List the buckets on the server.

    ~$ aws --profile minio --endpoint-url http://127.0.0.1:9000 s3 ls
    2021-12-13 23:16:20 new-s3-bucket
    2021-12-13 22:58:15 test-new-s3-bucket
  • Upload a file

    ~$ aws --profile minio --endpoint-url http://127.0.0.1:9000 s3 cp test-policy.json s3://new-s3-bucket
    upload: ./test-policy.json to s3://new-s3-bucket/test-policy.json
  • For comparison, here is the structure of the local minio-data directory bound into the minio container.

minio$ tree minio-data/
minio-data/
├── new-s3-bucket
│   └── test-policy.json
└── test-new-s3-bucket

2 directories, 1 file

Syncthing

~$ docker pull syncthing/syncthing
~$ docker run -p 8384:8384 -p 22000:22000 -v <YOUR PC FOLDER>/share:/var/syncthing syncthing/syncthing:latest

Creating with docker-compose

  • Here docker-compose is used with Traefik reverse-proxy support so the service can be exposed externally; this is only local internal testing, so HTTPS and a real domain are not configured.

    syncthing$ cat .env
    # Syncthing
    DOCKER_SYNCTHING_IMAGE_NAME=syncthing/syncthing
    DOCKER_SYNCTHING_HOSTNAME=syncthing-on-storage
    DOCKER_SYNCTHING_DOMAIN=syncthing.localhost

    # discosrv
    DOCKER_DISCOSRV_IMAGE_NAME=syncthing/discosrv
    DOCKER_DISCOSRV_HOSTNAME=discosrv-on-storage
    DOCKER_DISCOSRV_DOMAIN=discosrv.localhost

    # exporter
    DOCKER_EXPORTER_IMAGE_NAME=soulteary/syncthing-exporter
    # xxd -l 16 -p /dev/random | base64
    DOCKER_EXPORTER_API_TOKEN=OTU0NGJmMGJhYzRiNGEzM2Q3Yzc4MjhjOTdhZjJkMDAK
    DOCKER_EXPORTER_HOSTNAME=syncthing-exporter-on-storage
    DOCKER_EXPORTER_DOMAIN=syncthing-exporter.localhost

  • The docker-compose.yml file; it relies on the .env file in the current directory.

syncthing$ cat docker-compose.yml

version: "3"

services:
  syncthing:
    image: ${DOCKER_SYNCTHING_IMAGE_NAME}
    container_name: ${DOCKER_SYNCTHING_HOSTNAME}
    hostname: ${DOCKER_SYNCTHING_HOSTNAME}
    environment:
      - PUID=1000
      - PGID=1000
    volumes:
      - ./data:/var/syncthing
    ports:
      - "22000:22000"
    restart: always
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik_default"
      - "traefik.http.routers.sync-http.entrypoints=http"
      - "traefik.http.routers.sync-http.rule=Host(`${DOCKER_SYNCTHING_DOMAIN}`)"
      - "traefik.http.routers.sync-http.service=sync-backend"
      - "traefik.http.services.sync-backend.loadbalancer.server.scheme=http"
      - "traefik.http.services.sync-backend.loadbalancer.server.port=8384"
    networks:
      - traefik_default
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

networks:
  traefik_default:
    external: true
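The file above expects an external traefik_default network and a Traefik instance with an entrypoint named http. A minimal companion stack could look like this (a sketch; the image tag and flag values are assumptions, only the network name and entrypoint name come from the labels above):

```yaml
# docker-compose.traefik.yml (hypothetical companion stack)
version: "3"
services:
  traefik:
    image: traefik:v2.9
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"   # only route labeled containers
      - "--entrypoints.http.address=:80"              # entrypoint name used by the labels
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro  # let Traefik watch containers
    networks:
      - traefik_default

networks:
  traefik_default:
    external: true
```

The network must exist before either stack starts, e.g. docker network create traefik_default.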
  • Open http://127.0.0.1:8384 in a browser to reach the console; on mobile, install the syncthing client to connect.

Building from source and running as a systemd service

~$ git clone https://github.com/syncthing/syncthing
~$ cd syncthing && ./build.sh


~$ sudo cp bin/syncthing /usr/bin/
~$ sudo cp etc/linux-systemd/user/syncthing.service /etc/systemd/user/
~$ systemctl --user --now enable syncthing
~$ systemctl --user status syncthing

Installing MySQL + phpMyAdmin with docker-compose

version: '3'

services:
  db:
    image: mysql
    container_name: zlib-db
    environment:
      MYSQL_ROOT_PASSWORD: 123456
      MYSQL_DATABASE: zlib-db
      MYSQL_USER: zlib
      MYSQL_PASSWORD: zlib123
    ports:
      - "3306:3306"
    volumes:
      - ./db-data:/var/lib/mysql
  phpmyadmin:
    image: phpmyadmin
    container_name: pma
    links:
      - db
    environment:
      PMA_HOST: zlib-db
      PMA_PORT: 3306
      PMA_ARBITRARY: 1
    restart: always
    ports:
      - 8081:80

# declared but unused here; the db service uses the ./db-data bind mount instead
volumes:
  dbdata:

Collaborative documents

CryptPad

  • xwiki-labs/cryptpad
    ~$ sudo dokku apps:create cryptpad
    ~$ docker tag promasu/cryptpad:latest dokku/cryptpad:latest
    ~$ sudo dokku tags:deploy cryptpad latest
    ~$ sudo dokku domains:add cryptpad cryptpad.llccyy.dynv6.net

wiki.js

Installing with docker-compose

wiki.js$ cat docker-compose.yml
version: "3"
services:

  db:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: wiki
      POSTGRES_PASSWORD: wikijsrocks
      POSTGRES_USER: wikijs
    logging:
      driver: "none"
    restart: unless-stopped
    networks:
      - traefik_default
    volumes:
      - ./db-data:/var/lib/postgresql/data

  wiki:
    image: requarks/wiki:2
    depends_on:
      - db
    networks:
      - traefik_default
    environment:
      DB_TYPE: postgres
      DB_HOST: db
      DB_PORT: 5432
      DB_USER: wikijs
      DB_PASS: wikijsrocks
      DB_NAME: wiki
    restart: unless-stopped
    expose:
      - 3000
    labels:
      - traefik.enable=true
      - traefik.docker.network=traefik_default
      - traefik.http.routers.wiki.rule=Host(`wiki.localhost`)
      - traefik.http.routers.wiki.entrypoints=http
      - traefik.http.services.wiki.loadbalancer.server.port=3000

networks:
  traefik_default:
    external: true
  • As shown above, docker-compose.yml has the Traefik reverse proxy enabled. Once it is up, open http://wiki.localhost/ to reach the wiki.js setup wizard.

Configuring TLS and a domain name

cat docker-compose-wikijs.yml
version: "3"
services:

  db:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: wiki
      POSTGRES_PASSWORD: wikijsrocks
      POSTGRES_USER: wikijs
    logging:
      driver: "none"
    restart: unless-stopped
    networks:
      - traefik_default
    volumes:
      - ./db-data:/var/lib/postgresql/data

  wiki:
    image: requarks/wiki:2
    depends_on:
      - db
    networks:
      - traefik_default
    environment:
      DB_TYPE: postgres
      DB_HOST: db
      DB_PORT: 5432
      DB_USER: wikijs
      DB_PASS: wikijsrocks
      DB_NAME: wiki
    restart: unless-stopped
    expose:
      - 3000
    labels:
      - traefik.enable=true
      - traefik.docker.network=traefik_default
      - traefik.http.routers.wiki.rule=Host(`<your full domain name>`) && (PathPrefix(`/wiki`) || PathPrefix(`/_assets`))
      - traefik.http.routers.wiki.entrypoints=websecure
      - traefik.http.routers.wiki.tls.certresolver=myresolver
      - traefik.http.services.wiki.loadbalancer.server.port=3000

networks:
  traefik_default:
    external: true
  • As shown above, the two directives entrypoints=websecure and tls.certresolver=myresolver are defined on the traefik startup command line, and the router Rule must match the /wiki and /_assets path prefixes. The traefik startup command looks roughly like this:
[...]
  traefik:
    image: "traefik:v2.5"
    container_name: "traefik"
    command:
      - "--api.insecure=false"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.file.directory=/letsencrypt/"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.myresolver.acme.httpchallenge=true"
      - "--certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=web"
      #- "--certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.myresolver.acme.email=<your email>@gmail.com"
      - "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json"
[...]
  • On the first visit the wiki shows an initialization page; the SITE URL field must be set to https://<your full domain>/wiki.
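The `[...]` markers in the traefik fragment elide the rest of the service definition. As a rough sketch only (the port mappings and volume paths below are assumptions based on a typical Traefik v2 setup, not taken from this deployment), the elided parts usually include:

```yaml
    ports:
      - "80:80"    # entrypoint "web", also used for the ACME HTTP challenge
      - "443:443"  # entrypoint "websecure"
    volumes:
      # Traefik discovers labeled containers through the Docker socket
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      # certificate storage; matches --certificatesresolvers.myresolver.acme.storage
      - "./letsencrypt:/letsencrypt"
    networks:
      - traefik_default
```

The `traefik_default` network here mirrors what the wiki.js compose files above expect; adjust paths to your environment.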

AI/ML

Voice control

DeepSpeech

  • DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

MycroftAI/mimic3

  • A fast and local neural text to speech system developed by Mycroft for the Mark II.

MycroftAI/mycroft-core

  • Mycroft is a hackable open source voice assistant.

Communication

Matrix

Server side

Dendrite

  • dendrite

  • INSTALL.md

  • Here we follow the official documentation to quickly deploy a service with Docker.

~$ git clone https://github.com/matrix-org/dendrite

Client side

Project management

OpenProject

Automated testing

RobotFramework

Trojan

  • trojan and dokku share port 443, with layer-4 (SNI-based) forwarding.
~$ nginx  -T
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
# configuration file /etc/nginx/nginx.conf:
user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;

events {
worker_connections 768;
# multi_accept on;
}

http {

##
# Basic Settings
##

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
# server_tokens off;

# server_names_hash_bucket_size 64;
# server_name_in_redirect off;

include /etc/nginx/mime.types;
default_type application/octet-stream;

##
# SSL Settings
##

ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
ssl_prefer_server_ciphers on;

##
# Logging Settings
##

access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;

##
# Gzip Settings
##

gzip on;

# gzip_vary on;
# gzip_proxied any;
# gzip_comp_level 6;
# gzip_buffers 16 8k;
# gzip_http_version 1.1;
# gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

##
# Virtual Host Configs
##

include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}


#mail {
# # See sample authentication script at:
# # http://wiki.nginx.org/ImapAuthenticateWithApachePhpScript
#
# # auth_http localhost/auth.php;
# # pop3_capabilities "TOP" "USER";
# # imap_capabilities "IMAP4rev1" "UIDPLUS";
#
# server {
# listen localhost:110;
# protocol pop3;
# proxy on;
# }
#
# server {
# listen localhost:143;
# protocol imap;
# proxy on;
# }
#}

# https://raymii.org/s/tutorials/nginx_1.15.2_ssl_preread_protocol_multiplex_https_and_ssh_on_the_same_port.html
stream {
    map $ssl_preread_server_name $backend_name {
        proxy.yjdwbj.cloudns.org trojan;
        nc.llccyy.dynv6.net mycloud;
        fpm.yjdwbj.cloudns.org fpm;
        default dokku;
    }
    upstream dokku {
        server 127.0.0.1:2000;
    }
    upstream fpm {
        server 127.0.0.1:6443;
    }
    upstream trojan {
        server 172.17.0.2:443;
    }
    upstream mycloud {
        server 127.0.0.1:5443;
    }

    server {
        listen 443 reuseport;
        listen [::]:443 reuseport;
        proxy_pass $backend_name;
        #proxy_protocol on;
        ssl_preread on;
    }
}

# configuration file /etc/nginx/modules-enabled/50-mod-http-auth-pam.conf:
load_module modules/ngx_http_auth_pam_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-dav-ext.conf:
load_module modules/ngx_http_dav_ext_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-echo.conf:
load_module modules/ngx_http_echo_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-geoip.conf:
load_module modules/ngx_http_geoip_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-image-filter.conf:
load_module modules/ngx_http_image_filter_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-subs-filter.conf:
load_module modules/ngx_http_subs_filter_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-upstream-fair.conf:
load_module modules/ngx_http_upstream_fair_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-xslt-filter.conf:
load_module modules/ngx_http_xslt_filter_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-mail.conf:
load_module modules/ngx_mail_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-stream.conf:
load_module modules/ngx_stream_module.so;

# configuration file /etc/nginx/mime.types:

types {
text/html html htm shtml;
text/css css;
text/xml xml;
image/gif gif;
image/jpeg jpeg jpg;
application/javascript js;
application/atom+xml atom;
application/rss+xml rss;

text/mathml mml;
text/plain txt;
text/vnd.sun.j2me.app-descriptor jad;
text/vnd.wap.wml wml;
text/x-component htc;

image/png png;
image/tiff tif tiff;
image/vnd.wap.wbmp wbmp;
image/x-icon ico;
image/x-jng jng;
image/x-ms-bmp bmp;
image/svg+xml svg svgz;
image/webp webp;

application/font-woff woff;
application/java-archive jar war ear;
application/json json;
application/mac-binhex40 hqx;
application/msword doc;
application/pdf pdf;
application/postscript ps eps ai;
application/rtf rtf;
application/vnd.apple.mpegurl m3u8;
application/vnd.ms-excel xls;
application/vnd.ms-fontobject eot;
application/vnd.ms-powerpoint ppt;
application/vnd.wap.wmlc wmlc;
application/vnd.google-earth.kml+xml kml;
application/vnd.google-earth.kmz kmz;
application/x-7z-compressed 7z;
application/x-cocoa cco;
application/x-java-archive-diff jardiff;
application/x-java-jnlp-file jnlp;
application/x-makeself run;
application/x-perl pl pm;
application/x-pilot prc pdb;
application/x-rar-compressed rar;
application/x-redhat-package-manager rpm;
application/x-sea sea;
application/x-shockwave-flash swf;
application/x-stuffit sit;
application/x-tcl tcl tk;
application/x-x509-ca-cert der pem crt;
application/x-xpinstall xpi;
application/xhtml+xml xhtml;
application/xspf+xml xspf;
application/zip zip;

application/octet-stream bin exe dll;
application/octet-stream deb;
application/octet-stream dmg;
application/octet-stream iso img;
application/octet-stream msi msp msm;

application/vnd.openxmlformats-officedocument.wordprocessingml.document docx;
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet xlsx;
application/vnd.openxmlformats-officedocument.presentationml.presentation pptx;

audio/midi mid midi kar;
audio/mpeg mp3;
audio/ogg ogg;
audio/x-m4a m4a;
audio/x-realaudio ra;

video/3gpp 3gpp 3gp;
video/mp2t ts;
video/mp4 mp4;
video/mpeg mpeg mpg;
video/quicktime mov;
video/webm webm;
video/x-flv flv;
video/x-m4v m4v;
video/x-mng mng;
video/x-ms-asf asx asf;
video/x-ms-wmv wmv;
video/x-msvideo avi;
}

# configuration file /etc/nginx/conf.d/dokku-installer.conf:
upstream dokku-installer { server 127.0.0.1:2000; }
server {
listen 80;
location / {
proxy_pass http://dokku-installer;
}
}

# configuration file /etc/nginx/conf.d/dokku.conf:
include /home/dokku/*/nginx.conf;

server_tokens off;

# Settings from https://mozilla.github.io/server-side-tls/ssl-config-generator/
ssl_session_cache shared:SSL:20m;
ssl_session_timeout 1d;
ssl_session_tickets off;

ssl_dhparam /etc/nginx/dhparam.pem;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;

# configuration file /home/dokku/fpm/nginx.conf:

server {
listen [::]:80;
listen 80;
server_name fpm.yjdwbj.cloudns.org;
access_log /var/log/nginx/fpm-access.log;
error_log /var/log/nginx/fpm-error.log;

return 301 https://$host:6443$request_uri;

}

server {
listen [::]:6443 ssl http2;
listen 6443 ssl http2;

server_name fpm.yjdwbj.cloudns.org;
access_log /var/log/nginx/fpm-access.log;
error_log /var/log/nginx/fpm-error.log;

ssl_certificate /home/dokku/fpm/tls/server.crt;
ssl_certificate_key /home/dokku/fpm/tls/server.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers off;

keepalive_timeout 70;


location / {

gzip on;
gzip_min_length 1100;
gzip_buffers 4 32k;
gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/x-javascript application/json application/xml application/rss+xml font/truetype application/x-font-ttf font/opentype application/vnd.ms-fontobject image/svg+xml;
gzip_vary on;
gzip_comp_level 6;

proxy_pass http://fpm-80;
http2_push_preload on;
proxy_http_version 1.1;
proxy_read_timeout 60s;
proxy_buffer_size 4096;
proxy_buffering on;
proxy_buffers 8 4096;
proxy_busy_buffers_size 8192;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Request-Start $msec;

}


include /home/dokku/fpm/nginx.conf.d/*.conf;

error_page 400 401 402 403 405 406 407 408 409 410 411 412 413 414 415 416 417 418 420 422 423 424 426 428 429 431 444 449 450 451 /400-error.html;
location /400-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 404 /404-error.html;
location /404-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 500 501 503 504 505 506 507 508 509 510 511 /500-error.html;
location /500-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 502 /502-error.html;
location /502-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}
}

upstream fpm-80 {

server 172.17.0.5:80;
}


# configuration file /home/dokku/fpm/nginx.conf.d/hsts.conf:
add_header Strict-Transport-Security "max-age=15724800; includeSubdomains" always;

# configuration file /home/dokku/mycloud/nginx.conf:

server {
listen [::]:80;
listen 80;
server_name nc.llccyy.dynv6.net;
access_log /var/log/nginx/mycloud-access.log;
error_log /var/log/nginx/mycloud-error.log;

return 301 https://$host:5443$request_uri;

}

server {
listen [::]:5443 ssl http2;
listen 5443 ssl http2;

server_name nc.llccyy.dynv6.net;
access_log /var/log/nginx/mycloud-access.log;
error_log /var/log/nginx/mycloud-error.log;

ssl_certificate /home/dokku/mycloud/tls/server.crt;
ssl_certificate_key /home/dokku/mycloud/tls/server.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers off;

keepalive_timeout 70;


location / {

gzip on;
gzip_min_length 1100;
gzip_buffers 4 32k;
gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/x-javascript application/json application/xml application/rss+xml font/truetype application/x-font-ttf font/opentype application/vnd.ms-fontobject image/svg+xml;
gzip_vary on;
gzip_comp_level 6;

proxy_pass http://mycloud-80;
http2_push_preload on;
proxy_http_version 1.1;
proxy_read_timeout 60s;
proxy_buffer_size 4096;
proxy_buffering on;
proxy_buffers 8 4096;
proxy_busy_buffers_size 8192;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Request-Start $msec;

}

client_max_body_size 100m;
include /home/dokku/mycloud/nginx.conf.d/*.conf;

error_page 400 401 402 403 405 406 407 408 409 410 411 412 413 414 415 416 417 418 420 422 423 424 426 428 429 431 444 449 450 451 /400-error.html;
location /400-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 404 /404-error.html;
location /404-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 500 501 503 504 505 506 507 508 509 510 511 /500-error.html;
location /500-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 502 /502-error.html;
location /502-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}
}

upstream mycloud-80 {

server 172.17.0.4:80;
}


# configuration file /home/dokku/mycloud/nginx.conf.d/hsts.conf:
add_header Strict-Transport-Security "max-age=15724800; includeSubdomains" always;

# configuration file /home/dokku/trojan/nginx.conf:

server {
listen [::]:7443 ssl http2;
listen 7443 ssl http2;

server_name proxy.yjdwbj.cloudns.org;
access_log /var/log/nginx/trojan-access.log;
error_log /var/log/nginx/trojan-error.log;

ssl_certificate /home/dokku/trojan/tls/server.crt;
ssl_certificate_key /home/dokku/trojan/tls/server.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers off;

keepalive_timeout 70;


location / {

gzip on;
gzip_min_length 1100;
gzip_buffers 4 32k;
gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/x-javascript application/json application/xml application/rss+xml font/truetype application/x-font-ttf font/opentype application/vnd.ms-fontobject image/svg+xml;
gzip_vary on;
gzip_comp_level 6;

proxy_pass https://trojan-443;
http2_push_preload on;
proxy_http_version 1.1;
proxy_read_timeout 60s;
proxy_buffer_size 4096;
proxy_buffering on;
proxy_buffers 8 4096;
proxy_busy_buffers_size 8192;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Request-Start $msec;

}


include /home/dokku/trojan/nginx.conf.d/*.conf;

error_page 400 401 402 403 405 406 407 408 409 410 411 412 413 414 415 416 417 418 420 422 423 424 426 428 429 431 444 449 450 451 /400-error.html;
location /400-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 404 /404-error.html;
location /404-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 500 501 503 504 505 506 507 508 509 510 511 /500-error.html;
location /500-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 502 /502-error.html;
location /502-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}
}

upstream trojan-443 {

server 172.17.0.2:443;
}


# configuration file /home/dokku/trojan/nginx.conf.d/hsts.conf:
add_header Strict-Transport-Security "max-age=15724800; includeSubdomains" always;

# configuration file /etc/nginx/conf.d/server_names_hash_bucket_size.conf:
#server_names_hash_bucket_size 512;


  • The local ports corresponding to the map above are 5443, 7443 and 6443.
netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:6443 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:2000 0.0.0.0:* LISTEN 27904/python3
tcp 0 0 0.0.0.0:7443 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:2132 0.0.0.0:* LISTEN 1027/sshd
tcp 0 0 0.0.0.0:53 0.0.0.0:* LISTEN 2715/dnsmasq
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:8123 0.0.0.0:* LISTEN 444/polipo
tcp 0 0 0.0.0.0:5443 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:1443 0.0.0.0:* LISTEN 1435/ss-server
tcp6 0 0 :::6443 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::80 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::7443 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::2132 :::* LISTEN 1027/sshd
tcp6 0 0 :::53 :::* LISTEN 2715/dnsmasq
tcp6 0 0 :::443 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::5443 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::1443 :::* LISTEN 1435/ss-server

Resource roundup

  • 100 good free tools and resources
  • https://arxiv.org/
    • arXiv is a free distribution service and an open-access archive for 1,993,024 scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.
  • Computer engineering resource index


Volumes

  • Kubernetes redefines volumes and gives them rich, powerful functionality. By function they fall into three classes: local volumes, network volumes, and information volumes.
  • Volume types Kubernetes supports (the list grows and shrinks across releases):
    • Local volumes:
      • EmptyDir
      • HostPath
    • Network volumes:
      • NFS
      • iSCSI
      • GlusterFS
      • RBD
      • Flocker
      • GCE Persistent Disk
      • AWS Elastic Block Store
      • azureDisk
      • CephFS
      • fc (Fibre Channel)
      • Persistent Volume Claim
    • Information volumes:
      • Git Repo (deprecated)
      • Secret
      • Downward API

Local volumes

HostPath

  • Most Pods should be oblivious to their host node and should not access any files on the node's filesystem. But some system-level Pods (usually managed by a DaemonSet) do need to read the node's files, and in a test environment HostPath can stand in for a PV. A HostPath volume points at a specific file or directory on the node's filesystem; Pods running on the same node that use the same path in their HostPath volumes see the same files. To use HostPath provisioning in a cluster, start kube-controller-manager with the --enable-hostpath-provisioner flag.
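A minimal sketch of a Pod using a hostPath volume (all names here are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo            # illustrative name
spec:
  containers:
    - name: log-reader
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: node-logs
          mountPath: /host-logs  # path inside the container
  volumes:
    - name: node-logs
      hostPath:
        path: /var/log           # path on the host node's filesystem
        type: Directory          # fail if /var/log does not exist on the node
```

Two such Pods scheduled on the same node see the same files under /var/log; on different nodes they see different content, which is why HostPath is unsuitable for ordinary application data.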

Ceph cluster

Overview

  • Ceph is an open-source project providing a software-defined (SDS), unified storage solution. It is highly scalable, with capacity extending to the EB level. Ceph's technical strengths show in four areas: cluster reliability, cluster scalability, data safety, and interface uniformity.

Storage backends

  • The backend object store can be either filestore or bluestore:

  • FileStore:

    • FileStore is the legacy approach to storing objects in Ceph. It relies on a standard file system (normally XFS) in combination with a key/value database (traditionally LevelDB, now RocksDB) for some metadata.
    • FileStore is well-tested and widely used in production. However, it suffers from many performance deficiencies due to its overall design and its reliance on a traditional file system for object data storage.
    • Although FileStore is capable of functioning on most POSIX-compatible file systems (including btrfs and ext4), we recommend that only the XFS file system be used with Ceph. Both btrfs and ext4 have known bugs and deficiencies and their use may lead to data loss. By default, all Ceph provisioning tools use XFS.
  • BlueStore:

    • Key BlueStore features include:
      • Direct management of storage devices. BlueStore consumes raw block devices or partitions. This avoids intervening layers of abstraction (such as local file systems like XFS) that can limit performance or add complexity.
      • Metadata management with RocksDB. RocksDB’s key/value database is embedded in order to manage internal metadata, including the mapping of object names to block locations on disk.
      • Full data and metadata checksumming. By default, all data and metadata written to BlueStore is protected by one or more checksums. No data or metadata is read from disk or returned to the user without being verified.
      • Inline compression. Data can be optionally compressed before being written to disk.
      • Multi-device metadata tiering. BlueStore allows its internal journal (write-ahead log) to be written to a separate, high-speed device (like an SSD, NVMe, or NVDIMM) for increased performance. If a significant amount of faster storage is available, internal metadata can be stored on the faster device.
      • Efficient copy-on-write. RBD and CephFS snapshots rely on a copy-on-write clone mechanism that is implemented efficiently in BlueStore. This results in efficient I/O both for regular snapshots and for erasure-coded pools (which rely on cloning to implement efficient two-phase commits).
    • The following device layouts are supported:
      • A block device, a block.wal, and a block.db device
      • A block device and a block.wal device
      • A block device and a block.db device
      • A single block device
    • The block device itself has three options:
      • a whole disk
      • a disk partition
      • a logical volume (an LVM LV)
  • Notes:

    1. A raw disk cannot be used as block.db or block.wal, otherwise it fails with: blkid could not detect a PARTUUID for device;
    2. If a disk or a partition is used as block, ceph-volume creates an LV on it; if a partition is used as block.db or block.wal, the partition is used directly and no LV is created.
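Under those constraints, creating a BlueStore OSD with separate DB and WAL devices looks roughly like this (the device names are illustrative):

```
# /dev/sdb as --data: a whole disk, so ceph-volume creates an LV on it
# NVMe partitions as block.db / block.wal: used directly, no LV is created
~$ sudo ceph-volume lvm create --bluestore --data /dev/sdb \
     --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2
```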
  • BlueFS divides BlueStore's storage space into three tiers:

    • Slow space: mainly stores object data; can be served by ordinary high-capacity spinning disks and is managed by BlueStore itself.
    • High-speed (DB) space: stores metadata generated internally by BlueStore; can be served by an ordinary SSD; needs less capacity than the slow space.
    • Ultra-fast (WAL) space: mainly stores the .log files RocksDB produces internally; can be served by an SSD, NVRAM, or other devices with lower latency than an ordinary SSD. Its capacity needs match the DB space, and it is likewise managed directly by BlueFS.

Ceph components

  • Ceph OSD (Object Storage Device): stores data; handles data replication, recovery, and backfill rebalancing; and reports related data such as OSD heartbeats to the Ceph Monitors. A Ceph storage cluster needs at least one OSD to reach the active+clean health state and keep valid replicas of its data (two replicas by default, adjustable). Note: every disk or partition can become an OSD.
  • Ceph Monitor: maintains the health state of the whole cluster and provides consistent decision-making; it holds the Monitor map, OSD map, PG (Placement Group) map and CRUSH map.
  • Ceph MDS (Ceph Metadata Server): stores the metadata of the Ceph file system. Note: neither Ceph block storage nor Ceph object storage needs the MDS. The MDS provides POSIX-file-system users with basic commands such as ls and find. An MDS is only needed when creating a CephFS, and CephFS is still some distance from production use.

Ceph features

RADOS

  • RADOS offers self-healing and related capabilities, providing reliable, automatic, intelligent distributed storage. Its soul is the CRUSH (Controlled Replication Under Scalable Hashing) algorithm.

Ceph file system

  • CephFS is a distributed file system implemented on top of RADOS. It introduces the MDS (Metadata Server) mainly to provide metadata for POSIX file-system compatibility, and it is normally mounted like a regular file system.
  • Ceph file system (architecture diagram)

Ceph block devices

  • RBD (Rados Block Device) is built on top of librados: librbd creates a block device that is attached to a VM via QEMU/KVM and used like a traditional block device. OpenStack, CloudStack and others currently provision VM block devices this way, and snapshots and COW (Copy On Write) are also supported.

  • Ceph block device (architecture diagram)

Ceph object gateway

  • RADOSGW is built on top of librados and provides a gateway speaking the popular RESTful protocol, compatible with the AWS S3 and Swift interfaces. As object storage it can back cloud-drive applications, HLS streaming-media applications, and so on.

  • Overall architecture (diagram)

Installing with ceph/ceph-ansible

  • Notes on the release branches:
    • stable-3.0 supports Ceph jewel and luminous; this branch requires Ansible 2.4.
    • stable-3.1 supports Ceph luminous and mimic; this branch requires Ansible 2.4.
    • stable-3.2 supports Ceph luminous and mimic; this branch requires Ansible 2.6.
    • stable-4.0 supports Ceph nautilus; this branch requires Ansible 2.8.
    • master tracks Ceph@master; this branch requires Ansible 2.8.
~$ git clone https://github.com/ceph/ceph-ansible
~$ cd ceph-ansible && git checkout v3.2.9
~$ pip install -r requirements.txt

Installing with ceph/ceph-deploy

Quick install (apt)

  • ceph-deploy is the older deployment method and the process is somewhat involved. In testing, ceph-deploy could not be installed via apt; pip install ceph-deploy worked.
  • The experiments below use VirtualBox VMs. Create a Linux VM, install Debian 9, and give it two NICs: a NAT interface (10.0.2.0/24) for downloading packages, and a Vboxnet1 interface (192.168.99.0/24) for cluster traffic. Install some common tools, then clone four new VMs, changing each clone's hostname and IP address. Ansible is used below to operate on these VMs in batch.

(VirtualBox cluster layout diagram)

  • On each node VM, install the time-sync packages (apt-get install ntp ntpdate ntp-doc) and set up passwordless SSH public-key login. Ansible can batch these steps as well.
  • Note: /etc/hosts on every node must agree with the hosts that ceph-deploy operates on, otherwise ceph-deploy mon create-initial fails to proceed.
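For example, each node (and the ceph-deploy host) might carry entries like the following in /etc/hosts; node1 maps to 192.168.99.101 as seen in the ansible output below, while the other two addresses are illustrative:

```
192.168.99.101  node1
192.168.99.102  node2
192.168.99.103  node3
```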

Cleaning up old nodes

~$ ceph-deploy purge {ceph-node} [{ceph-node}]
~$ ceph-deploy purgedata {ceph-node} [{ceph-node}]
~$ ceph-deploy forgetkeys
~$ rm ceph.*

Installing nodes

~$ ceph-deploy new --help
usage: ceph-deploy new [-h] [--no-ssh-copykey] [--fsid FSID]
[--cluster-network CLUSTER_NETWORK]
[--public-network PUBLIC_NETWORK]
MON [MON ...]

Start deploying a new cluster, and write a CLUSTER.conf and keyring for it.

positional arguments:
MON initial monitor hostname, fqdn, or hostname:fqdn pair

optional arguments:
-h, --help show this help message and exit
--no-ssh-copykey do not attempt to copy SSH keys
--fsid FSID provide an alternate FSID for ceph.conf generation
--cluster-network CLUSTER_NETWORK
specify the (internal) cluster network
--public-network PUBLIC_NETWORK
specify the public network for a cluster

~$ ceph-deploy new node1 node2 node3 --public-network 192.168.99.0/24

# The command below ssh's into each node and installs the ceph packages, roughly:
# apt install ceph ceph-base ceph-common ceph-mds ceph-mon ceph-osd radosgw
# Because ceph-deploy 2.0.x is used, the release must be pinned to luminous (v12) or newer,
# otherwise it installs the Mimic (v13) release by default. The latest release is Nautilus (v14.0.2).
~$ ceph-deploy install --release luminous node1 node2 node3
# These two flags speed up the install: --repo-url http://mirrors.ustc.edu.cn/ceph/debian-luminous --gpg-url http://mirrors.ustc.edu.cn/ceph/keys/release.asc
# The command above creates ceph.conf and ceph.mon.keyring in the current directory
  • Configuration references for Pool, PG (Placement Groups) and CRUSH are in the official documentation; tuning the PG parameters below can follow PgCalc.
# --> ceph.conf
[global]

# By default, Ceph makes 3 replicas of objects. If you want to make four
# copies of an object the default value--a primary copy and three replica
# copies--reset the default values as shown in 'osd pool default size'.
# If you want to allow Ceph to write a lesser number of copies in a degraded
# state, set 'osd pool default min size' to a number less than the
# 'osd pool default size' value.

osd pool default size = 3 # Write an object 3 times.
osd pool default min size = 2 # Allow writing two copies in a degraded state.

# Ensure you have a realistic number of placement groups. We recommend
# approximately 100 per OSD. E.g., total number of OSDs multiplied by 100
# divided by the number of replicas (i.e., osd pool default size). So for
# 10 OSDs and osd pool default size = 4, we'd recommend approximately
# (100 * 10) / 4 = 250.

osd pool default pg num = 250
osd pool default pgp num = 250
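The rule of thumb in the comments above can be computed directly. A small helper (illustrative, not part of any Ceph tool; note that Ceph's PG calculator additionally rounds the raw value up to a power of two):

```python
def recommended_pg_count(num_osds: int, pool_size: int, pgs_per_osd: int = 100) -> int:
    """Approximate PG count: (pgs_per_osd * num_osds) / pool_size,
    rounded up to the next power of two as the Ceph PG calculator suggests."""
    raw = pgs_per_osd * num_osds / pool_size
    power = 1
    while power < raw:
        power *= 2
    return power

# 10 OSDs with 4 replicas: raw value 250, rounded up to 256
print(recommended_pg_count(10, 4))
```

The ceph.conf above sets the raw value (250) directly, which also works; Ceph only warns when PG counts drift far from the recommendation.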
  • Initialize the Monitors

    ~$ ceph-deploy mon create node1 node2 node3
    ~$ ceph-deploy gatherkeys node1 node2 node3
  • Note: if the error below appears, the file system may have less than 5% free space. The error details are in /var/log/ceph/ceph-mon.DB001.log:

    [DB001][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.DB001.asok mon_status
    [DB001][ERROR ] b'admin_socket: exception getting command descriptions: [Errno 2] No such file or directory'
  • The following files are created in the current directory:

    • {cluster-name}.client.admin.keyring
    • {cluster-name}.mon.keyring
    • {cluster-name}.bootstrap-osd.keyring
    • {cluster-name}.bootstrap-mds.keyring
    • {cluster-name}.bootstrap-rgw.keyring
    • {cluster-name}.bootstrap-mgr.keyring
  • Distribute the Ceph configuration and keys to the cluster nodes:

    $ ceph-deploy admin node1 node2 node3
    [ceph_deploy.conf][DEBUG ] found configuration file at: /home/lcy/.cephdeploy.conf
    [ceph_deploy.cli][INFO ] Invoked (2.0.1): /home/lcy/.pyenv/versions/py3dev/bin/ceph-deploy admin node1 node2 node3
    [ceph_deploy.cli][INFO ] ceph-deploy options:
    [ceph_deploy.cli][INFO ] verbose : False
    [ceph_deploy.cli][INFO ] quiet : False
    [ceph_deploy.cli][INFO ] username : None
    [ceph_deploy.cli][INFO ] overwrite_conf : False
    [ceph_deploy.cli][INFO ] ceph_conf : None
    [ceph_deploy.cli][INFO ] cluster : ceph
    [ceph_deploy.cli][INFO ] client : ['node1', 'node2', 'node3']
    [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fef98f77390>
    [ceph_deploy.cli][INFO ] default_release : False
    [ceph_deploy.cli][INFO ] func : <function admin at 0x7fef99bdd6a8>
    [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to node1
    [node1][DEBUG ] connection detected need for sudo
    [node1][DEBUG ] connected to host: node1
    [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to node2
    [node2][DEBUG ] connection detected need for sudo
    [node2][DEBUG ] connected to host: node2
    [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to node3
    [node3][DEBUG ] connection detected need for sudo
    [node3][DEBUG ] connected to host: node3
  • Check the cluster status. You can log in and run this directly with root privileges, or run it through Ansible:

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_OK

    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

    data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0B
    usage: 0B used, 0B / 0B avail
    pgs:

Ceph Manager deployment

  • Reference documentation
    ~$ ceph-deploy mgr create node1 node2 node3
    [...]
    ~$ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_OK

    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1(active), standbys: node3, node2
    osd: 0 osds: 0 up, 0 in

    data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0B
    usage: 0B used, 0B / 0B avail
    pgs:

Ceph OSD deployment

  • ceph-volume
  • Since Ceph Luminous 12.2.2, ceph-disk is deprecated and replaced by ceph-volume.
~$ ceph-deploy osd -h
usage: ceph-deploy osd [-h] {list,create} ...

Create OSDs from a data disk on a remote host:

ceph-deploy osd create {node} --data /path/to/device

For bluestore, optional devices can be used::

ceph-deploy osd create {node} --data /path/to/data --block-db /path/to/db-device
ceph-deploy osd create {node} --data /path/to/data --block-wal /path/to/wal-device
ceph-deploy osd create {node} --data /path/to/data --block-db /path/to/db-device --block-wal /path/to/wal-device

For filestore, the journal must be specified, as well as the objectstore::

ceph-deploy osd create {node} --filestore --data /path/to/data --journal /path/to/journal

For data devices, it can be an existing logical volume in the format of:
vg/lv, or a device. For other OSD components like wal, db, and journal, it
can be logical volume (in vg/lv format) or it must be a GPT partition.

positional arguments:
{list,create}
list List OSD info from remote host(s)
create Create new Ceph OSD daemon by preparing and activating a
device

optional arguments:
-h, --help show this help message and exit

~$ ceph-deploy osd create -h
usage: ceph-deploy osd create [-h] [--data DATA] [--journal JOURNAL]
[--zap-disk] [--fs-type FS_TYPE] [--dmcrypt]
[--dmcrypt-key-dir KEYDIR] [--filestore]
[--bluestore] [--block-db BLOCK_DB]
[--block-wal BLOCK_WAL] [--debug]
[HOST]

positional arguments:
HOST Remote host to connect

optional arguments:
-h, --help show this help message and exit
--data DATA The OSD data logical volume (vg/lv) or absolute path
to device
--journal JOURNAL Logical Volume (vg/lv) or path to GPT partition
--zap-disk DEPRECATED - cannot zap when creating an OSD
--fs-type FS_TYPE filesystem to use to format DEVICE (xfs, btrfs)
--dmcrypt use dm-crypt on DEVICE
--dmcrypt-key-dir KEYDIR
directory where dm-crypt keys are stored
--filestore filestore objectstore
--bluestore bluestore objectstore
--block-db BLOCK_DB bluestore block.db path
--block-wal BLOCK_WAL
bluestore block.wal path
--debug Enable debug mode on remote ceph-volume calls
  • node1 has an extra disk added to serve as its OSD disk; below, each node's whole disk is created as a single block device.
~$ ceph-deploy osd create node1 --data /dev/vdb
~$ ceph-deploy osd create node2 --data /dev/vdb
~$ ceph-deploy osd create node3 --data /dev/vdb
  • Error message: if the disk still carries old LVM metadata, it must be cleared manually first, otherwise the error below appears. Inspect it with lvdisplay, then clear the old LVM metadata with lvremove --force, vgdisplay, and vgremove --force.
[DB001][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
  • Clear the old LVM metadata

    ~$ ansible -i hosts all  -b -m shell  -a "lvdisplay | awk 'NR==2 {print $3}'| xargs  lvremove --force ;  vgdisplay | awk 'NR==2 {print $3}' | xargs  vgremove"
  • Check the OSD status

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph osd stat"
    3 osds: 3 up, 3 in

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph df"
    GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    180GiB 177GiB 3.02GiB 1.68
    POOLS:
    NAME ID USED %USED MAX AVAIL OBJECTS
    hdd 1 375B 0 84.0GiB 8
    cephfs_data 2 0B 0 84.0GiB 0
    cephfs_metadata 3 2.19KiB 0 84.0GiB 21
  • Inspect the OSDs

    $ ansible -i hosts node1  -b -m command -a "ceph osd tree"

    node1 | CHANGED | rc=0 >>
    ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
    -1 0.17578 root default
    -3 0.05859 host node1
    0 hdd 0.05859 osd.0 up 1.00000 1.00000
    -7 0.05859 host node2
    2 hdd 0.05859 osd.2 up 1.00000 1.00000
    -5 0.05859 host node3
    1 hdd 0.05859 osd.1 up 1.00000 1.00000
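In the tree above, the WEIGHT column is the CRUSH weight, which by default equals the device capacity expressed in TiB. A quick sanity check (illustrative only, assuming the 60 GiB OSD disks implied by the 180 GiB total that `ceph df` reports):

```python
# CRUSH weight defaults to capacity in TiB; each OSD disk here is 60 GiB.
capacity_gib = 60
weight = capacity_gib / 1024
print(round(weight, 5))  # 0.05859, matching the WEIGHT column
```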
  • Check the system status

    $ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_OK

    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1(active), standbys: node3, node2
    osd: 1 osds: 1 up, 1 in

    data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0B
    usage: 1.00GiB used, 9.00GiB / 10.0GiB avail # from the OSD just created on node1:/dev/sdb.
    pgs:
    # Pipelines require the shell module; the command module errors out on them.
    ~$ ansible -i ../hosts node1 -b -m shell -a "mount | grep ceph"
    tmpfs on /var/lib/ceph/osd/ceph-0 type tmpfs (rw,relatime)

    ~$ ansible -i ../hosts node1 -b -m command -a "ls -l /var/lib/ceph/"
    192.168.99.101 | CHANGED | rc=0 >>
    total 44
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 bootstrap-mds
    drwxr-xr-x 2 ceph ceph 4096 May 9 21:33 bootstrap-mgr
    drwxr-xr-x 2 ceph ceph 4096 May 9 22:27 bootstrap-osd
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 bootstrap-rbd
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 bootstrap-rgw
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 mds
    drwxr-xr-x 3 ceph ceph 4096 May 9 21:33 mgr
    drwxr-xr-x 3 ceph ceph 4096 May 9 21:22 mon
    drwxr-xr-x 3 ceph ceph 4096 May 9 22:27 osd
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 radosgw
    drwxr-xr-x 2 ceph ceph 4096 May 9 21:22 tmp

    ~$ ansible -i ../hosts node1 -b -m command -a "ls -l /var/lib/ceph/osd/ceph-0"
    192.168.99.101 | CHANGED | rc=0 >>
    total 48
    -rw-r--r-- 1 ceph ceph 393 May 9 22:27 activate.monmap
    lrwxrwxrwx 1 ceph ceph 93 May 9 22:27 block -> /dev/ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024/osd-block-261c9455-fbc4-4eba-9783-5fba4290048d
    -rw-r--r-- 1 ceph ceph 2 May 9 22:27 bluefs
    -rw-r--r-- 1 ceph ceph 37 May 9 22:27 ceph_fsid
    -rw-r--r-- 1 ceph ceph 37 May 9 22:27 fsid
    -rw------- 1 ceph ceph 55 May 9 22:27 keyring
    -rw-r--r-- 1 ceph ceph 8 May 9 22:27 kv_backend
    -rw-r--r-- 1 ceph ceph 21 May 9 22:27 magic
    -rw-r--r-- 1 ceph ceph 4 May 9 22:27 mkfs_done
    -rw-r--r-- 1 ceph ceph 41 May 9 22:27 osd_key
    -rw-r--r-- 1 ceph ceph 6 May 9 22:27 ready
    -rw-r--r-- 1 ceph ceph 10 May 9 22:27 type
    -rw-r--r-- 1 ceph ceph 2 May 9 22:27 whoami
  • Inspect Ceph's configuration parameters

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph --show-config"
    name = client.admin
    cluster = ceph
    debug_none = 0/5
    debug_lockdep = 0/1
    [....]
  • Inspect the LVM details

    ~$ ansible -i ../hosts node1 -b -m command -a "pvdisplay"
    192.168.99.101 | CHANGED | rc=0 >>
    --- Physical volume ---
    PV Name /dev/sdb
    VG Name ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024
    PV Size 10.00 GiB / not usable 4.00 MiB
    Allocatable yes (but full)
    PE Size 4.00 MiB
    Total PE 2559
    Free PE 0
    Allocated PE 2559
    PV UUID Qd6kSs-Ivbp-3APy-21Tv-XQgx-EhBn-XfioVa

    ~$ ansible -i ../hosts node1 -b -m command -a "vgdisplay"
    192.168.99.101 | CHANGED | rc=0 >>
    --- Volume group ---
    VG Name ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024
    System ID
    Format lvm2
    Metadata Areas 1
    Metadata Sequence No 17
    VG Access read/write
    VG Status resizable
    MAX LV 0
    Cur LV 1
    Open LV 1
    Max PV 0
    Cur PV 1
    Act PV 1
    VG Size 10.00 GiB
    PE Size 4.00 MiB
    Total PE 2559
    Alloc PE / Size 2559 / 10.00 GiB
    Free PE / Size 0 / 0
    VG UUID XiVkQ6-aUPv-3BRw-Gj1N-jdG4-HRxf-hCS3Mg

    ~$ ansible -i ../hosts node1 -b -m command -a "lvdisplay"
    192.168.99.101 | CHANGED | rc=0 >>
    --- Logical volume ---
    LV Path /dev/ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024/osd-block-261c9455-fbc4-4eba-9783-5fba4290048d
    LV Name osd-block-261c9455-fbc4-4eba-9783-5fba4290048d
    VG Name ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024
    LV UUID F9dF0S-qwb7-LJC0-vld2-TF6g-nP8q-9ncsdI
    LV Write Access read/write
    LV Creation host, time node1, 2019-05-09 22:27:44 -0400
    LV Status available
    # open 4
    LV Size 10.00 GiB
    Current LE 2559
    Segments 1
    Allocation inherit
    Read ahead sectors auto
    - currently set to 256
    Block device 253:0
  • A quick note on how the LVM layers relate: an LV is built on top of a VG, and a VG is built on top of PVs.

  • Next, shut down node2 and add a 20G disk to it, to test the other BlueStore device layouts.

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_WARN
    1/3 mons down, quorum node1,node3 # warns that one node is shut down.

    services:
    mon: 3 daemons, quorum node1,node3, out of quorum: node2
    mgr: node1(active), standbys: node3
    osd: 1 osds: 1 up, 1 in

    data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0B
    usage: 1.00GiB used, 9.00GiB / 10.0GiB avail
    pgs:

Parted (GPT partitioning)

  • Partitioning with fdisk (MBR) leads to errors, so parted (GPT) is used below.

    root@node2:~# parted /dev/sdb
    GNU Parted 3.2
    Using /dev/sdb
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) mklabel gpt
    (parted) print
    Model: ATA VBOX HARDDISK (scsi)
    Disk /dev/sdb: 21.5GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number Start End Size File system Name Flags

    (parted) mkpart parimary 0 10G
    Warning: The resulting partition is not properly aligned for best performance.
    Ignore/Cancel?
    Ignore/Cancel? Ignore
    (parted) print
    Model: ATA VBOX HARDDISK (scsi)
    Disk /dev/sdb: 21.5GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number Start End Size File system Name Flags
    1 17.4kB 10.0GB 10000MB parimary

    (parted) mkpart parimary 10G 21.5G
    (parted) p
    Model: ATA VBOX HARDDISK (scsi)
    Disk /dev/sdb: 21.5GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number Start End Size File system Name Flags
    1 17.4kB 10.0GB 10000MB parimary
    2 10.0GB 21.5GB 11.5GB parimary
    (parted) q

    root@node2:~# partx /dev/sdb
    NR START END SECTORS SIZE NAME UUID
    1 34 19531250 19531217 9.3G parimary a8c625b7-ebf2-4ceb-a9fd-5371dde59b35
    2 19531776 41940991 22409216 10.7G parimary 6463703f-c1f3-4ad7-8870-ed634db64131

    root@node2:~# lsblk /dev/sdb
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sdb 8:16 0 20G 0 disk
    ├─sdb1 8:17 0 9.3G 0 part
    └─sdb2 8:18 0 10.7G 0 part
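The "not properly aligned" warning above appears because the first partition starts at sector 34 (17.4kB) rather than on a 1 MiB boundary; starting it with, e.g., `mkpart parimary 1MiB 10GB` avoids the warning. The sector arithmetic, as an illustrative check (assuming the 512-byte logical sectors shown by parted):

```python
SECTOR = 512            # logical sector size from the parted output
ALIGN = 1024 * 1024     # 1 MiB alignment boundary

first_aligned_sector = ALIGN // SECTOR
print(first_aligned_sector)            # 2048, the usual aligned start
# Sector 34, where parted actually began, is not on the boundary:
print((34 * SECTOR) % ALIGN == 0)      # False
```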
  • Add osd.1 (block, block.db)

~$ ceph-deploy osd create node2 --data /dev/sdb2 --block-db /dev/sdb1
[...]
[node2][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb2 --block.db /dev/sdb1
[node2][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
[node2][INFO ] checking OSD status...
[node2][INFO ] Running command: sudo /usr/bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host node2 is now ready for osd use.

# Check the cluster status.
~$ ansible -i ../hosts node1 -b -m command -a "ceph -s"
192.168.99.101 | CHANGED | rc=0 >>
cluster:
id: 0bf150da-b691-4382-bf3d-600e90c19fba
health: HEALTH_OK

services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node1(active), standbys: node3, node2
osd: 2 osds: 2 up, 2 in

data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 2.00GiB used, 18.7GiB / 20.7GiB avail
pgs:
# Inspect /var/lib/ceph/osd/ceph-1 on node2
~$ ansible -i ../hosts node2 -b -m command -a "ls -l /var/lib/ceph/osd/ceph-1"
192.168.99.102 | CHANGED | rc=0 >>
total 48
-rw-r--r-- 1 ceph ceph 393 May 9 23:53 activate.monmap
lrwxrwxrwx 1 ceph ceph 93 May 9 23:53 block -> /dev/ceph-98f53d51-8e74-4ca3-8b7a-87570c01733e/osd-block-f572ef53-805e-48ff-b936-da520e46be6b
lrwxrwxrwx 1 ceph ceph 9 May 9 23:53 block.db -> /dev/sdb1
-rw-r--r-- 1 ceph ceph 2 May 9 23:53 bluefs
-rw-r--r-- 1 ceph ceph 37 May 9 23:53 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 May 9 23:53 fsid
-rw------- 1 ceph ceph 55 May 9 23:53 keyring
-rw-r--r-- 1 ceph ceph 8 May 9 23:53 kv_backend
-rw-r--r-- 1 ceph ceph 21 May 9 23:53 magic
-rw-r--r-- 1 ceph ceph 4 May 9 23:53 mkfs_done
-rw-r--r-- 1 ceph ceph 41 May 9 23:53 osd_key
-rw-r--r-- 1 ceph ceph 6 May 9 23:53 ready
-rw-r--r-- 1 ceph ceph 10 May 9 23:53 type
-rw-r--r-- 1 ceph ceph 2 May 9 23:53 whoami

Creating the MDS servers

~$ ceph-deploy mds create  FE001 DIG001
# Check the status
~$ ansible -i ../hosts node1 -b -m command -a "ceph mds stat"

~$ ansible -i ../hosts node1 -b -m command -a "ceph osd pool create cephfs_data 64 64"
pool 'cephfs_data' created

~$ ansible -i ../hosts node1 -b -m command -a "ceph osd pool create cephfs_metadata 64 64"
pool 'cephfs_metadata' created

# Create the filesystem
~$ ansible -i ../hosts node1 -b -m command -a "ceph fs new cephfs cephfs_metadata cephfs_data"
new fs with metadata pool 3 and data pool 2

~$ ansible -i ../hosts node1 -b -m command -a "ceph mds stat"
cephfs-1/1/1 up {0=DIG001=up:active}, 1 up:standby
~$ ansible -i ../hosts node1 -b -m command -a "ceph fs ls"
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]

~$ ansible -i ../hosts node1 -b -m command -a "ceph fs status"
cephfs - 0 clients
======
+------+--------+--------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+--------+---------------+-------+-------+
| 0 | active | DIG001 | Reqs: 0 /s | 10 | 12 |
+------+--------+--------+---------------+-------+-------+
+-----------------+----------+-------+-------+
| Pool | type | used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 2246 | 83.9G |
| cephfs_data | data | 0 | 83.9G |
+-----------------+----------+-------+-------+

+-------------+
| Standby MDS |
+-------------+
| FE001 |
+-------------+
MDS version: ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)

# Inspect the OSD metadata.
~$ sudo ceph osd metadata osd.2
  • Mount the filesystem. You can use the key from /etc/ceph/ceph.client.admin.keyring, or create a dedicated user and key as shown below.

    ~$ sudo ceph auth get-or-create client.cephfs mon 'allow r' mds 'allow rw' osd 'allow rw pool=cephfs-data, allow rw pool=cephfs-metadata'
    ~$ sudo ceph auth get client.cephfs
    exported keyring for client.cephfs
    [client.cephfs]
    key = AQDAwhldGXL3GhAAGsHu3XYUIwzS6z0SOcLMFA==
    caps mds = "allow rw"
    caps mon = "allow r"
    caps osd = "allow rw pool=cephfs-data, allow rw pool=cephfs-metadata"

    ~$ sudo mount.ceph node1:6789:/ /data -o name=cephfs,secret=AQDAwhldGXL3GhAAGsHu3XYUIwzS6z0SOcLMFA==
  • With the mount invocation above, the key is visible in the shell, which is insecure. Instead, save the Base64 key string AQDAwhldGXL3GhAAGsHu3XYUIwzS6z0SOcLMFA== to a file and restrict it with chmod 400.

    ~$ sudo mount.ceph node1:6789:/ /data -o name=cephfs,secretfile=/etc/ceph/cephfs.secret

# Add an automatic mount entry
    ~$ echo "mon1:6789,mon2:6789,mon3:6789:/ /cephfs ceph name=cephfs,secretfile=/etc/ceph/cephfs.secret,_netdev,noatime 0 0" | sudo tee -a /etc/fstab
[2019-07-01 14:20:45,567][ceph_volume.process][INFO ] stdout ceph.block_device=/dev/ceph-bd417a6a-cef6-4ff5-828a-5b68ec8843f0/osd-block-dcde5f54-c555-41ee-8c20-586f1069bcb7,ceph.block_uuid=wHZD0b-lU7P-vYFg-XOBI-zknV-Q181-0xKtt3,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=d7f63adc-33d1-4ae9-9ba7-ae401950d965,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=dcde5f54-c555-41ee-8c20-586f1069bcb7,ceph.osd_id=1,ceph.type=block,ceph.vdo=0";"/dev/ceph-bd417a6a-cef6-4ff5-828a-5b68ec8843f0/osd-block-dcde5f54-c555-41ee-8c20-586f1069bcb7";"osd-block-dcde5f54-c555-41ee-8c20-586f1069bcb7";"ceph-bd417a6a-cef6-4ff5-828a-5b68ec8843f0";"wHZD0b-lU7P-vYFg-XOBI-zknV-Q181-0xKtt3";"60.00g
[2019-07-01 14:20:45,567][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
File "/usr/lib/python2.7/dist-packages/ceph_volume/main.py", line 148, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, in dispatch
instance.main()
File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/main.py", line 40, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, in dispatch
instance.main()
File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
self.activate(args)
File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 249, in activate
raise RuntimeError('could not find osd.%s with fsid %s' % (osd_id, osd_fsid))
RuntimeError: could not find osd.1 with fsid 3aeba7b7-f539-4b6a-afac-fc9fd62b90fa
~$ sudo lvs -o lv_tags
LV Tags
ceph.block_device=/dev/ceph-9c0a0bae-d6db-498a-bf20-fe4cd8bdb3a9/osd-block-5c5a950b-8b36-4935-be8c-b59c24073874,ceph.block_uuid=yY970H-ztZ4-VtfA-2L9d-k3cF-Zi44-0i8MB1,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=d7f63adc-33d1-4ae9-9ba7-ae401950d965,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=5c5a950b-8b36-4935-be8c-b59c24073874,ceph.osd_id=2,ceph.type=block,ceph.vdo=0
  • RGW can also be installed on servers outside the Ceph cluster; it requires the ceph-radosgw package, e.g. ceph-deploy install --rgw <rgw-node> [<rgw-node>...]. For convenience, RGW is installed directly on node3 and node4 below.

  • Add a mon node

    ~$ ceph-deploy mon add node4
# node3 already received the admin config earlier.
~$ ceph-deploy admin node4
~$ ceph-deploy rgw create node3 node4
[...]
~$ ansible -i ../hosts node3 -b -m shell -a "ps -ef | grep rgw"
192.168.99.103 | CHANGED | rc=0 >>
ceph 4272 1 0 02:05 ? 00:00:00 /usr/bin/radosgw -f --cluster ceph --name client.rgw.node3 --setuser ceph --setgroup ceph
root 5040 5039 0 02:06 pts/0 00:00:00 /bin/sh -c ps -ef | grep rgw
root 5042 5040 0 02:06 pts/0 00:00:00 grep rgw

~$ ansible -i ../hosts node4 -b -m shell -a "ps -ef | grep rgw"
192.168.99.104 | CHANGED | rc=0 >>
ceph 3411 1 0 02:05 ? 00:00:00 /usr/bin/radosgw -f --cluster ceph --name client.rgw.node4 --setuser ceph --setgroup ceph
root 4211 4210 0 02:07 pts/0 00:00:00 /bin/sh -c ps -ef | grep rgw
root 4213 4211 0 02:07 pts/0 00:00:00 grep rgw

# Test HTTP access
~$ curl node3:7480
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
  • Change RGW's default port by adding the following two lines to ceph.conf:
[client.rgw.node4]
# rgw_frontends = "civetweb port=80"
rgw_frontends = civetweb port=80+443s ssl_certificate=/etc/ceph/private/keyandcert.pem
  • Push the configuration file in the current directory to the specified node.
    ~$ ceph-deploy --overwrite-conf config push node4
    ~$ ansible -i hosts node4 -b -m systemd -a "name=radosgw state=restarted daemon_reload=yes"

civetweb configuration

  • If the cluster was not deployed with ceph-deploy, RGW has to be added and configured manually via the steps below:

    # Create the keyring
    ~$ sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.radosgw.keyring
    # Generate a key
    ~$ sudo ceph-authtool /etc/ceph/ceph.client.radosgw.keyring -n client.rgw.node3 --gen-key
    # Set the capabilities
    ~$ sudo ceph-authtool -n client.rgw.node3 --cap osd 'allow rwx' --cap mon 'allow rwx' /etc/ceph/ceph.client.radosgw.keyring
    # Import the keyring into the cluster
    ~$ sudo ceph -k /etc/ceph/ceph.client.admin.keyring auth add client.rgw.node3 -i /etc/ceph/ceph.client.radosgw.keyring
    ~$ cat /etc/ceph/ceph.conf
    [...]
    [client.rgw.node3]
    rgw_frontends = civetweb port=80
    host=node3
    rgw_s3_auth_use_keystone=false
    keyring=/etc/ceph/ceph.client.radosgw.keyring
    log file=/var/log/ceph/client.radosgw.gateway.log
  • Here the cluster was deployed with ceph-deploy, so it is enough to export the corresponding key into /etc/ceph/ceph.client.radosgw.keyring:

    ~$ sudo ceph auth get client.rgw.node3
    exported keyring for client.rgw.node3
    # Copy the stanza below into /etc/ceph/ceph.client.radosgw.keyring
    [client.rgw.node3]
    key = AQC8FNVcl07ALRAAfhr+APpuKW/VvknEzD7hpg==
    caps mon = "allow rw"
    caps osd = "allow rwx"
  • Test access

    ~$ sudo radosgw --cluster ceph --name client.rgw.node3 --setuser ceph --setgroup ceph -d --debug_ms 1 --keyring /etc/ceph/ceph.client.radosgw.keyring
  • If it starts up normally, restart it as a service with systemctl restart ceph-radosgw@rgw.node3; if the service has errors, inspect them with journalctl -u ceph-radosgw@rgw.node3.

Client access

Creating an S3 user

$ ansible -i ../hosts node4 -b -m command -a "radosgw-admin user create --uid=\"lcy\" --display-name=\"admin user test\""
192.168.99.104 | CHANGED | rc=0 >>
{
"user_id": "lcy",
"display_name": "admin user test",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": [],
"keys": [
{
"user": "lcy",
"access_key": "74I2DQ89N5EL1OGCCSCV", # s3cmd needs this
"secret_key": "ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE" # s3cmd needs this
}
],
"swift_keys": [],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"temp_url_keys": [],
"type": "rgw"
}
~$ ansible -i ../hosts node4 -b -m command -a "radosgw-admin user list"
192.168.99.104 | CHANGED | rc=0 >>
[
"testuser",
"lcy"
]
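Since radosgw-admin emits JSON, the keys can be pulled out programmatically instead of copied by hand. A minimal sketch using a trimmed copy of the output above (field names exactly as in the transcript):

```python
import json

# Trimmed from the `radosgw-admin user create` output above.
doc = """
{
  "user_id": "lcy",
  "keys": [
    {
      "user": "lcy",
      "access_key": "74I2DQ89N5EL1OGCCSCV",
      "secret_key": "ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE"
    }
  ]
}
"""
user = json.loads(doc)
access_key = user["keys"][0]["access_key"]
secret_key = user["keys"][0]["secret_key"]
print(access_key)  # the value s3cmd and boto need
```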

Creating a Swift user

  • A Swift user is created as a subuser, so the parent user (here: lcy, created above) must exist first:
    ~$ ansible -i ../hosts node4 -b -m command -a "radosgw-admin subuser create --uid=lcy --subuser=lcy:swift --access=full"
    192.168.99.104 | CHANGED | rc=0 >>
    {
    "user_id": "lcy",
    "display_name": "admin user test",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [
    {
    "id": "lcy:swift",
    "permissions": "full-control"
    }
    ],
    "keys": [
    {
    "user": "lcy",
    "access_key": "74I2DQ89N5EL1OGCCSCV",
    "secret_key": "ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE"
    }
    ],
    "swift_keys": [
    {
    "user": "lcy:swift",
    "secret_key": "bw2zByEnhZMzpSvrb9tYi5rjOT8mK69SkuuWFN8j"
    }
    ],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "placement_tags": [],
    "bucket_quota": {
    "enabled": false,
    "check_on_raw": false,
    "max_size": -1,
    "max_size_kb": 0,
    "max_objects": -1
    },
    "user_quota": {
    "enabled": false,
    "check_on_raw": false,
    "max_size": -1,
    "max_size_kb": 0,
    "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw"
    }

Testing with the Python client libraries

~$ pip install boto python-swiftclient
~$ ipython
In [1]: access_key = '74I2DQ89N5EL1OGCCSCV'
In [2]: secret_key = 'ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE'
In [3]: import boto.s3.connection
In [4]: conn = boto.connect_s3(aws_access_key_id=access_key,aws_secret_access_key=secret_key,host='192.168.99.103',port=7480,is_secure=False,calling_format=boto.s3.connection.OrdinaryCallingFormat())
In [5]: bkt = conn.create_bucket('ooo-bucket')
In [6]: for bkt in conn.get_all_buckets():
...: print("{name} {created}".format(name=bkt.name,created=bkt.creation_date))
...: # bucket created and listed successfully.
ooo-bucket 2019-05-10T07:08:26.456Z
# Test with the swift client.
~$ swift -A http://node4:7480/auth/1.0 -U lcy:swift -K 'bw2zByEnhZMzpSvrb9tYi5rjOT8mK69SkuuWFN8j' list
ooo-bucket

Testing with s3cmd

  • Before using s3cmd, its parameters are normally set interactively with s3cmd --configure. Here we skip that and write the necessary connection parameters directly. Either RGW node (node3 or node4) can be used for the test.

    ~$ sudo apt install s3cmd
    ~$ cat <<EOF > ~/.s3cfg
    [default]
    access_key = 74I2DQ89N5EL1OGCCSCV
    host_base = node3:7480
    host_bucket = node3:7480/%(bucket)
    secret_key = ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE
    cloudfront_host = node3:7480
    use_https = False
    bucket_location = US
    EOF

    # List all buckets
    ~$ s3cmd ls
    2019-05-10 07:08 s3://ooo-bucket
    # Create a bucket
    ~$ s3cmd mb s3://sql
    # Upload a file into the bucket
    ~$ s3cmd put ~/wxdb-20190422-1638.sql s3://sql
    upload: '/home/lcy/wxdb-20190422-1638.sql' -> 's3://sql/wxdb-20190422-1638.sql' [1 of 1]
    197980 of 197980 100% in 1s 104.33 kB/s done
    # List the files in the bucket
    ~$ s3cmd ls s3://sql
    2019-05-10 08:12 197980 s3://sql/wxdb-20190422-1638.sql
    # Download a file from the bucket to the local machine.
    ~$ s3cmd get s3://sql/wxdb-20190422-1638.sql
    download: 's3://sql/wxdb-20190422-1638.sql' -> './wxdb-20190422-1638.sql' [1 of 1]
    197980 of 197980 100% in 0s 57.23 MB/s done
  • Check the cluster utilization statistics

    ~$ ansible -i ../hosts node2 -b -m command -a "ceph df"
    192.168.99.102 | CHANGED | rc=0 >>
    GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    20.7GiB 18.7GiB 2.01GiB 9.72
    POOLS:
    NAME ID USED %USED MAX AVAIL OBJECTS
    .rgw.root 1 2.08KiB 0 5.83GiB 6
    default.rgw.control 2 0B 0 5.83GiB 8
    default.rgw.meta 3 2.13KiB 0 5.83GiB 12
    default.rgw.log 4 0B 0 5.83GiB 207
    default.rgw.buckets.index 5 0B 0 5.83GiB 3
    default.rgw.buckets.data 6 193KiB 0 5.83GiB 1

Mounting the filesystem with s3fs-fuse

  • s3fs-fuse
  • Reference for storing Django files on AWS S3
  • The original plan was to mount a Ceph S3 bucket directly as a Docker volume, but that has not been made to work yet. Below, the bucket is mounted on the host instead and then passed into Docker with -v.
    ~$ sudo apt install s3fs fuse

    # The credentials can also go in /etc/passwd-s3fs
    ~$ echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > ${HOME}/.passwd-s3fs && chmod 600 ${HOME}/.passwd-s3fs
    ~$ s3cmd ls
    2019-05-10 08:10 s3://iso
    2019-05-16 03:50 s3://media # this bucket is mounted as a directory below.
    2019-05-10 07:08 s3://ooo-bucket
    2019-05-16 06:44 s3://public
    2019-05-10 08:10 s3://sql

    # Note: Ceph S3 requires the use_path_request_style option, since it is not native AWS.
    ~$ s3fs media /data/s3fs -o allow_other,umask=022,use_path_request_style,url=http://node3

    ~$ df -h | grep s3fs
    s3fs 256T 0 256T 0% /data/s3fs

    ~$ grep s3fs /etc/mtab
    s3fs /data/s3fs fuse.s3fs rw,nosuid,nodev,relatime,user_id=1000,group_id=120,allow_other 0 0

    # If mounted without umask, the default is 0000, i.e. no access permissions.
    ~$ ls -l /data/s3fs/
    total 9397
    drwxr-xr-x 1 root root 0 Jan 1 1970 hls
    -rwxr-xr-x 1 root root 3100721 May 16 14:24 video.mp4
  • For debugging, start it with -o dbglevel=info -f -o curldbg. For the remaining features, see its GitHub page and its help output.

Warnings and errors

  • Reference documentation

    $ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_WARN
    Degraded data redundancy: 237/711 objects degraded (33.333%), 27 pgs degraded, 48 pgs undersized

    services:
    mon: 4 daemons, quorum node1,node2,node3,node4
    mgr: node1(active), standbys: node2, node3
    osd: 2 osds: 2 up, 2 in
    rgw: 2 daemons active

    data:
    pools: 6 pools, 48 pgs
    objects: 237 objects, 198KiB
    usage: 2.01GiB used, 18.7GiB / 20.7GiB avail
    pgs: 237/711 objects degraded (33.333%)
    27 active+undersized+degraded
    21 active+undersized
  • Given the warning above about degraded PGs, restart the OSD service on the nodes with systemctl restart ceph-osd.target and check again.

  • On closer inspection, the cause is that the replica count is 3 while only two OSDs exist here, which triggers the degradation warning above. Either lower the replica count to 2 or add another OSD node.
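The 33.333% in the warning follows directly from that: with a pool size of 3, the 237 objects should hold 711 copies in total, and each object is one replica short. An illustrative check of the arithmetic:

```python
objects = 237
pool_size = 3        # osd pool default size
osds_up = 2          # only two OSDs existed at this point

expected_copies = objects * pool_size          # 711
missing = objects * (pool_size - osds_up)      # one missing replica per object
print(f"{missing}/{expected_copies} objects degraded "
      f"({missing / expected_copies:.3%})")
```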

  • Below, just as with node2, add a 20G disk, split it into two partitions, and create the OSD in the (block, block.wal) layout.

~$ ceph-deploy osd create node3 --data /dev/sdb2 --block-wal /dev/sdb1

~$ ansible -i ../hosts node1 -b -m command -a "ceph osd tree"
192.168.99.101 | CHANGED | rc=0 >>
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.03058 root default
-3 0.00980 host node1
0 hdd 0.00980 osd.0 up 1.00000 1.00000
-5 0.01039 host node2
1 hdd 0.01039 osd.1 up 1.00000 1.00000
-7 0.01039 host node3
2 hdd 0.01039 osd.2 up 1.00000 1.00000

# Check again; the status is back to normal.
~$ ansible -i ../hosts node1 -b -m command -a "ceph health"
192.168.99.101 | CHANGED | rc=0 >>
HEALTH_OK
  • Ceph: HEALTH_WARN clock skew detected
# Enable ntp at boot on all nodes.
~$ ansible -i hosts all -b -m systemd -a "name=ntp enabled=yes state=started"
  • Handling the "application not enabled on 1 pool(s)" warning
~$ sudo ceph health detail
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
application not enabled on pool 'kube'
use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications.
~$ sudo ceph osd pool application enable kube rbd
enabled application 'rbd' on pool 'kube'
  • After installation, the listening services on each node are as follows:
    $ ansible -i hosts node -b -m command -a "netstat -tnlp"
    192.168.99.102 | CHANGED | rc=0 >>
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    tcp 0 0 192.168.99.102:6789 0.0.0.0:* LISTEN 476/ceph-mon
    tcp 0 0 192.168.99.102:6800 0.0.0.0:* LISTEN 875/ceph-osd
    tcp 0 0 192.168.99.102:6801 0.0.0.0:* LISTEN 875/ceph-osd
    tcp 0 0 192.168.99.102:6802 0.0.0.0:* LISTEN 875/ceph-osd
    tcp 0 0 192.168.99.102:6803 0.0.0.0:* LISTEN 875/ceph-osd
    tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 529/sshd
    tcp6 0 0 :::22 :::* LISTEN 529/sshd

    192.168.99.101 | CHANGED | rc=0 >>
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    tcp 0 0 192.168.99.101:6789 0.0.0.0:* LISTEN 480/ceph-mon
    tcp 0 0 192.168.99.101:6800 0.0.0.0:* LISTEN 1015/ceph-osd
    tcp 0 0 192.168.99.101:6801 0.0.0.0:* LISTEN 1015/ceph-osd
    tcp 0 0 192.168.99.101:6802 0.0.0.0:* LISTEN 1015/ceph-osd
    tcp 0 0 192.168.99.101:6803 0.0.0.0:* LISTEN 1015/ceph-osd
    tcp 0 0 192.168.99.101:6804 0.0.0.0:* LISTEN 476/ceph-mgr
    tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 537/sshd
    tcp6 0 0 :::22 :::* LISTEN 537/sshd

    192.168.99.103 | CHANGED | rc=0 >>
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    tcp 0 0 192.168.99.103:6789 0.0.0.0:* LISTEN 479/ceph-mon
    tcp 0 0 192.168.99.103:6800 0.0.0.0:* LISTEN 965/ceph-osd
    tcp 0 0 192.168.99.103:6801 0.0.0.0:* LISTEN 965/ceph-osd
    tcp 0 0 192.168.99.103:6802 0.0.0.0:* LISTEN 965/ceph-osd
    tcp 0 0 192.168.99.103:6803 0.0.0.0:* LISTEN 965/ceph-osd
    tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 527/sshd
    tcp 0 0 0.0.0.0:7480 0.0.0.0:* LISTEN 480/radosgw
    tcp6 0 0 :::22 :::* LISTEN 527/sshd

    192.168.99.104 | CHANGED | rc=0 >>
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    tcp 0 0 192.168.99.104:6789 0.0.0.0:* LISTEN 445/ceph-mon
    tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 447/radosgw
    tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 515/sshd
    tcp6 0 0 :::22 :::* LISTEN 515/sshd

Kubernetes Integration

Creating RBD Images

  • RBD operations must be run while logged in on a cluster node; ceph-deploy provides no interface for them. Ansible can be used to run them remotely.

# For how to calculate a pool's PG count, see https://ceph.com/pgcalc/
    ~$ sudo ceph osd pool create kube 64 64
    pool 'kube' created

# Set the pool's replica count.
    ~$ sudo ceph osd pool set kube size 2

    ~$ sudo ceph osd lspools
    1 .rgw.root,2 default.rgw.control,3 default.rgw.meta,4 default.rgw.log,5 default.rgw.buckets.index,6 default.rgw.buckets.data,7 volumes,8 kube,

    ~$ sudo rbd create kube/cephimage2 --size 40960
    ~$ sudo rbd list kube
    cephimage2

    ~$ sudo rbd info kube/cephimage2
    rbd image 'cephimage2':
    size 40GiB in 10240 objects
    order 22 (4MiB objects)
    block_name_prefix: rbd_data.519a06b8b4567
    format: 2
    #
    features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
    flags:
    create_timestamp: Mon May 13 01:44:35 2019

    ~$ sudo rbd create kube/cephimage1 --size 10240

# Resize cephimage1 from its original 10G up to 20G.
    ~$ sudo rbd resize kube/cephimage1 --size 20480

~$ sudo rbd create kube/cephimage3 --size 4096 --image-feature layering
  • By default, newly created RBD images enable the layering, exclusive-lock, object-map, fast-diff, and deep-flatten features. Older Linux kernels do not support all of them; typically they support only layering. If the kernel is too old, creating a Pod fails with the error below.

MountVolume.WaitForAttach failed for volume "ceph-rbd-pv" : rbd: map failed exit status 6, rbd output: rbd: sysfs write failed RBD image feature set mismatch. Try disabling features unsupported by the kernel with "rbd feature disable". In some cases useful info is found in syslog - try "dmesg | tail". rbd: map failed: (6) No such device or address

~# dmesg
[1355258.253726] rbd: image foo: image uses unsupported features: 0x38
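As the error message itself suggests, the unsupported features can also be stripped from an already-created image with rbd feature disable. A sketch, reusing the cephimage2 example above (adjust the feature list to match what your kernel lacks):

```
# Disable the features a pre-4.x kernel typically cannot map;
# layering is left enabled.
~$ sudo rbd feature disable kube/cephimage2 deep-flatten fast-diff object-map exclusive-lock
```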
  • Create the Pod in the cluster

    ~$ git clone https://github.com/kubernetes/examples.git
    ~$ cd examples/staging/volumes/rbd/
    ~$ tree
    .
    ├── rbd-with-secret.yaml
    ├── rbd.yaml
    ├── README.md
    └── secret
    └── ceph-secret.yaml
  • Edit rbd-with-secret.yaml so that it reads:

    apiVersion: v1
    kind: Pod
    metadata:
      name: rbd2
    spec:
      containers:
        - image: busybox
          command: ["sleep", "60000"]
          name: rbd-rw
          volumeMounts:
            - name: rbdpd
              mountPath: /mnt/rbd
      volumes:
        - name: rbdpd
          rbd:
            monitors:
              - '192.168.99.101:6789'
              - '192.168.99.102:6789'
              - '192.168.99.103:6789'
              - '192.168.99.104:6789'
            pool: kube
            image: cephimage3
            fsType: ext4
            readOnly: false
            user: admin
            secretRef:
              name: ceph-secret
  • Edit ceph-secret.yaml, taking care to replace the key field.

    ~$ ansible -i hosts node1 -b -m command -a "cat /etc/ceph/ceph.client.admin.keyring" | grep key | awk '{printf "%s",$NF}' | base64
    QVFESDB0UmNFSStwR3hBQUJ4aW1ZT1VXRWVTckdzSStpZklCOWc9PQ==
    ~$ cat secret/ceph-secret.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: ceph-secret
    type: "kubernetes.io/rbd"
    data:
      key: QVFESDB0UmNFSStwR3hBQUJ4aW1ZT1VXRWVTckdzSStpZklCOWc9PQ== # taken from the command output above.
  • Create the Secret and the Pod

    ~$ kubectl create -f secret/ceph-secret.yaml
    ~$ kubectl create -f rbd-with-secret.yaml

    ~$ kubectl get pods -o wide
    NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
    rbd2 1/1 Running 0 60m 10.244.1.2 node2 <none> <none>
    ~$ kubectl get secret
    NAME TYPE DATA AGE
    ceph-secret kubernetes.io/rbd 1 17h

    # The RBD volume can now be used like a local disk.
    ~$ kubectl exec -it rbd2 -- df -h | grep -e "rbd0" -e "secret"
    /dev/rbd0 3.9G 16.0M 3.8G 0% /mnt/rbd
    tmpfs 498.2M 12.0K 498.2M 0% /var/run/secrets/kubernetes.io/serviceaccount
  • Create and test an RBD-backed PV and PVC

~$ cat rbd-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-rbd-pv
spec:
  capacity:
    storage: 4Gi
  accessModes:
    - ReadWriteOnce
  rbd:
    monitors:
      - '192.168.99.101:6789'
      - '192.168.99.102:6789'
      - '192.168.99.103:6789'
      - '192.168.99.104:6789'
    pool: kube
    image: cephimage1
    fsType: ext4
    readOnly: false
    user: admin
    secretRef:
      name: ceph-secret
  persistentVolumeReclaimPolicy: Recycle

~$ cat rbd-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi

~$ kubectl create -f rbd-pv.yaml
~$ kubectl create -f rbd-pvc.yaml
~$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
ceph-rbd-pv 4Gi RWO Recycle Bound default/ceph-rbd-pvc 17h
~$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ceph-rbd-pvc Bound ceph-rbd-pv 4Gi RWO 17h

Installing with Ceph-Ansible

On the New Ethernet Interface Naming Scheme

enp0s10:
| | |
v | |
en| | --> ethernet
v |
p0| --> bus number (0)
v
s10 --> slot number (10)
  • If you are not used to the new naming, there are three ways to revert to the old style:

  • You basically have three options:

    1. You disable the assignment of fixed names, so that the unpredictable kernel names are used again. For this, simply mask udev’s .link file for the default policy: ln -s /dev/null /etc/systemd/network/99-default.link
    2. You create your own manual naming scheme, for example by naming your interfaces “internet0”, “dmz0” or “lan0”. For that create your own .link files in /etc/systemd/network/, that choose an explicit name or a better naming scheme for one, some, or all of your interfaces. See systemd.link(5) for more information.
    3. You pass net.ifnames=0 on the kernel command line.
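For option 2, a manual naming scheme is just a small .link unit. A sketch (the MAC address and the name lan0 are placeholder values; see systemd.link(5)):

```
# /etc/systemd/network/10-lan0.link
[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=lan0
```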
  • List the virtual machines

    ~$ VBoxManage list vms
    "k8s-master" {7bfb1ca4-3ccc-4a1a-8548-7759424df181}
    "k8s-node1" {4c29c029-4f93-4463-b83d-4ae9e728e9df}
    "k8s-node2" {87a2196c-cf3c-472a-9ffa-f5b8c3e09009}
    "k8s-node3" {af9e34cf-a7c9-45d8-ad15-f37d409bcdac}
    "k8s-node4" {1f46e865-01c1-4a81-a947-cc267c744756}

    # Start the above VM with VBoxHeadless; it will not open a window.
    ~$ VBoxHeadless --startvm k8s-master
  • The following installs ceph-deploy as described on the official site, but apt cannot find a package named ceph-deploy.

    ~$ wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
    # Replace {ceph-stable-release} with a Ceph stable release name (e.g. cuttlefish, dumpling, emperor, firefly, nautilus).
    ~$ echo deb http://download.ceph.com/debian-{ceph-stable-release}/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
    ~$ sudo apt-get update && sudo apt-get install ceph-deploy
  • Below, this is converted into an Ansible playbook. Note that Ceph on GitHub also maintains a ceph-ansible project, which I have not used; it has nearly 1k stars.

---
- name: Install base software
  hosts: all
  become: yes
  # user: root — root could be used directly, but once remote root login is disabled you must use sudo.
  tasks:
    # See https://docs.ansible.com/ansible/latest/modules/command_module.html#command-module
    - name: Read the distribution release codename
      command: lsb_release -sc
      register: result

    - name: Install the Ceph release key
      apt_key:
        url: https://download.ceph.com/keys/release.asc
        state: present

    # See https://docs.ansible.com/ansible/latest/modules/apt_repository_module.html?highlight=add%20apt%20repository
    - name: ceph-deploy
      apt_repository:
        repo: deb http://download.ceph.com/debian-nautilus {{ result.stdout }} main
        state: present
        filename: ceph

    # See https://docs.ansible.com/ansible/latest/modules/apt_key_module.html?highlight=apt%20key
    - name: Add the docker-ce public key
      apt_key:
        url: https://download.docker.com/linux/debian/gpg
        state: present

    # See https://docs.ansible.com/ansible/latest/modules/apt_repository_module.html?highlight=add%20apt%20repository
    - name: docker-ce
      apt_repository:
        repo: deb [arch=amd64] https://download.docker.com/linux/debian {{ result.stdout }} stable
        state: present
        filename: docker-ce

    # See https://docs.ansible.com/ansible/latest/modules/apt_module.html
    - name: Update the cache and install packages
      apt:
        name:
          ['ntp', 'ntpdate', 'ntp-doc', 'docker-ce', 'bridge-utils', 'ipvsadm']
        allow_unauthenticated: yes
        update_cache: yes

    # See https://docs.ansible.com/ansible/latest/modules/lineinfile_module.html?highlight=sudoers
    # Ansible's sysctl module would also work: https://docs.ansible.com/ansible/latest/modules/sysctl_module.html?highlight=sysctl
    - name: Write the sysctl settings
      lineinfile:
        path: /etc/sysctl.d/80-k8s.conf
        create: yes
        line: '{{ item }}'
      with_items:
        - 'net.bridge.bridge-nf-call-ip6tables = 1'
        - 'net.bridge.bridge-nf-call-iptables = 1'
        - 'net.bridge.bridge-nf-call-arptables = 1'
        - 'net.ipv4.ip_forward = 1'

    - name: Reload sysctl
      command: sysctl --system

    - block:
        # Naming scheme references: https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
        # https://major.io/2015/08/21/understanding-systemds-predictable-network-device-names/
        - name: Use the old NIC naming scheme
          file:
            src: /dev/null
            dest: /etc/systemd/network/99-default.link
            state: link

        # The .link mask above did not seem to take effect on Debian; fall back to the kernel parameter.
        - name: Update the kernel command line
          lineinfile:
            path: /etc/default/grub
            regexp: '^GRUB_CMDLINE_LINUX='
            line: 'GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0"'

        - name: Regenerate grub.cfg
          command: grub-mkconfig -o /boot/grub/grub.cfg

Installing the Kubernetes Master

~$ sudo kubeadm init --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers  --kubernetes-version v1.14.1 --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=192.168.99.100
[...]
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user: # Follow these steps in order: install the network add-on first, then join the other nodes.

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.99.100:6443 --token ejtj7f.oth6on2k6y0qcj2k \
--discovery-token-ca-cert-hash sha256:d162721230250668a4296aca699867126314a9ecd2418f9c70110b6b02bd01de

# Continue by installing the network add-on.
~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
  • By default the cluster runs kube-proxy in iptables mode; switch it to ipvs mode manually. Open the ConfigMap with kubectl -n kube-system edit cm kube-proxy, change mode: "" to mode: "ipvs", then delete the old pod with kubectl -n kube-system delete pod kube-proxy-xxx; a new pod is created automatically.
~$ kubectl -n kube-system logs kube-proxy-t27xd
I0514 06:33:30.681150 1 server_others.go:177] Using ipvs Proxier. ---> switched to ipvs mode.
W0514 06:33:30.738710 1 proxier.go:381] IPVS scheduler not specified, use rr by default
I0514 06:33:30.747818 1 server.go:555] Version: v1.14.1
[...]

# View the ipvs table.
~$ sudo ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.17.0.1:32047 rr
TCP 192.168.99.100:32047 rr
TCP 10.0.2.15:32047 rr
TCP 10.96.0.1:443 rr
-> 192.168.99.100:6443 Masq 1 3 0
[...]
  • Join the nodes to the k8s cluster in bulk with Ansible.

    ~$ cat hosts
    [master]
    192.168.99.100
    [node1]
    192.168.99.101
    [node2]
    192.168.99.102
    [node3]
    192.168.99.103
    [node4]
    192.168.99.104
    [node]
    192.168.99.101
    192.168.99.102
    192.168.99.103
    192.168.99.104

    ~$ ansible -i hosts node -b -m command -a "kubeadm join 192.168.99.100:6443 --token ejtj7f.oth6on2k6y0qcj2k --discovery-token-ca-cert-hash sha256:d162721230250668a4296aca699867126314a9ecd2418f9c70110b6b02bd01de"
  • Check the master node's status

    ~$ kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    k8s-master NotReady master 15h v1.14.1
    node1 NotReady <none> 15h v1.14.1
    node2 NotReady <none> 15h v1.14.1
    node3 NotReady <none> 15h v1.14.1
    node4 NotReady <none> 15h v1.14.1

    # Why are all the nodes NotReady?
    ~$ kubectl get pods -n kube-system
    NAME READY STATUS RESTARTS AGE
    coredns-d5947d4b-kfhlp 0/1 Pending 0 15h
    coredns-d5947d4b-sq95j 0/1 Pending 0 15h
    etcd-k8s-master 1/1 Running 2 15h
    kube-apiserver-k8s-master 1/1 Running 2 15h
    kube-controller-manager-k8s-master 1/1 Running 2 15h
    kube-proxy-25vgp 1/1 Running 2 15h
    kube-proxy-75xjc 1/1 Running 1 15h
    kube-proxy-bvdh6 1/1 Running 1 15h
    kube-proxy-lzp8m 1/1 Running 1 15h
    kube-proxy-wnmwk 1/1 Running 1 15h
    kube-scheduler-k8s-master 1/1 Running 2 15h

    # Why is coredns Pending?
    ~$ kubectl describe pod coredns -n kube-system
    [...]
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Warning FailedScheduling 10m (x49 over 81m) default-scheduler 0/5 nodes are available: 5 node(s) had taints that the pod didn't tolerate.
    Warning FailedScheduling 75s (x4 over 5m21s) default-scheduler 0/5 nodes are available: 5 node(s) had taints that the pod didn't tolerate.

    # Check the system journal
    ~$ sudo journalctl -u kubelet
    # The cause: the network add-on had not been installed.
    ~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml

Building with Rook

  • Ceph and Rook integration (figure: CephRook)
  • Rook architecture (figure: rook architecture)

Installing Ceph

~$ git clone https://github.com/rook/rook
~$ cd rook/cluster/examples/kubernetes/ceph/
~$ kubectl create -f common.yaml
~$ kubectl create -f operator.yaml
~$ kubectl create -f cluster.yaml
~$ kubectl -n rook-ceph get pods
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-f7ln5 1/1 Running 0 5m36s
rook-ceph-agent-fzztf 1/1 Running 0 5m36s
rook-ceph-agent-mgqk6 1/1 Running 0 5m36s
rook-ceph-agent-qdbmh 1/1 Running 0 5m36s
rook-ceph-agent-twsvp 1/1 Running 0 5m36s
rook-ceph-operator-775cf575c5-8k44f 1/1 Running 1 6m30s
rook-discover-d4btd 1/1 Running 0 5m36s
rook-discover-fbq9w 1/1 Running 0 5m36s
rook-discover-gcksv 1/1 Running 0 5m36s
rook-discover-hnbdj 1/1 Running 0 5m36s
rook-discover-j5x5h 1/1 Running 0 5m36s

Tearing Down Rook

~$ cat remove-nodes-rooks-containers.sh
for i in `seq 1 4`; do
for n in `ansible -i hosts node$i -b -m command -a "docker ps -a" | awk 'NR>2 {print $1}'`;do
#ansible -i hosts node$i -b -m command -a "docker stop $n ; docker rm $n";
ansible -i hosts node$i -b -m command -a "docker rm $n";
done;
done

~$ cat remove-rook-cluster-data.sh
ansible -i hosts all -b -m file -a "path=/var/lib/rook state=absent"
ansible -i hosts all -b -m file -a "path=/etc/kubernetes state=absent"
ansible -i hosts all -b -m file -a "path=/var/lib/kubelet state=absent"

Errors

~$ kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
rook-ceph-mon-a-f799d9cf6-xrg8f 0/1 Init:CrashLoopBackOff 6 8m46s
rook-ceph-mon-d-5dd7b4d56f-wwg8n 0/1 Init:CrashLoopBackOff 6 7m1s
rook-ceph-mon-f-7977bd98c9-9b6h4 0/1 Init:CrashLoopBackOff 5 5m19s

~$ kubectl -n rook-ceph describe pod rook-ceph-mon-a
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9m15s default-scheduler Successfully assigned rook-ceph/rook-ceph-mon-a-f799d9cf6-xrg8f to k8s-master
Normal Pulled 7m20s (x5 over 9m4s) kubelet, k8s-master Container image "rook/ceph:v0.9.3" already present on machine
Normal Created 7m19s (x5 over 9m2s) kubelet, k8s-master Created container config-init
Normal Started 7m18s (x5 over 8m59s) kubelet, k8s-master Started container config-init
Warning BackOff 3m52s (x26 over 8m52s) kubelet, k8s-master Back-off restarting failed container

~$ kubectl -n rook-ceph describe pod rook-ceph-mon-d
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m2s default-scheduler Successfully assigned rook-ceph/rook-ceph-mon-d-5dd7b4d56f-wwg8n to node1
Normal Pulled 6m15s (x5 over 7m45s) kubelet, node1 Container image "rook/ceph:v0.9.3" already present on machine
Normal Created 6m15s (x5 over 7m45s) kubelet, node1 Created container config-init
Normal Started 6m14s (x5 over 7m45s) kubelet, node1 Started container config-init
Warning BackOff 2m41s (x26 over 7m43s) kubelet, node1 Back-off restarting failed container


PaaS Overview

  • PaaS (Platform as a Service) delivers a software development platform (or business platform) as a service, offered to users in the SaaS model. PaaS is one of the service models of cloud computing. Cloud computing is a pay-per-use model, akin to renting; the service can be infrastructure compute resources (IaaS), a platform (PaaS), or software (SaaS). Renting IT resources to meet business needs means compute, storage, and networking become consumable resources for enterprise IT, like water or electricity, available on demand with no need to build them yourself. The essence of PaaS is turning Internet resources into programmable interfaces, giving third-party developers a commercially valuable resource and service platform. In short: IaaS sells hardware and compute, PaaS sells a development and runtime environment, SaaS sells software.
Type | Description | Analogy | Examples
IaaS: Infrastructure-as-a-Service | Provides computing infrastructure | A plot of land: you build the house yourself | Amazon EC2 (Amazon Elastic Compute Cloud), Alibaba Cloud
PaaS: Platform-as-a-Service | Provides a software development or business platform | An unfurnished apartment: you do the fitting-out | GAE (Google App Engine), Heroku
SaaS: Software-as-a-Service | Provides applications running on cloud infrastructure | A hotel suite: move straight in | Google's Gmail

Kubernetes Overview

  • What is Kubernetes?
  • Kubernetes is Google's open-source container cluster management system. Built on Docker, it provides a full suite of features for containerized applications: resource scheduling, deployment, service discovery, and scaling. It can essentially be seen as a container-based Micro-PaaS, a representative third-generation PaaS project.
  • Kubernetes architecture diagram (figure: Kubernetes)

Basic Kubernetes Concepts

Pod

  • A Pod is a group of related containers and a logical concept. The containers in a Pod run on the same host, share a network namespace, IP address, and ports, can discover and talk to each other via localhost, and share a storage volume. The Pod is the smallest unit that Kubernetes creates, schedules, and manages. A Pod usually holds one business container plus one network container for unified network management.
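As an illustration, a minimal Pod manifest with a single business container (the names and image here are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod          # hypothetical name
  labels:
    app: web
spec:
  containers:
    - name: web
      image: nginx:1.17  # the single business container
      ports:
        - containerPort: 80
```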

Replication Controller

  • A Replication Controller manages Pod replicas. It ensures that a specified number of Pod replicas are running in the Kubernetes cluster at all times: if there are fewer than specified, it starts new ones; if there are more, it kills the excess to keep the count constant. The Replication Controller is also the core of elastic scaling and rolling upgrades.

Deployment

  • A Deployment is a simpler mechanism for updating RCs and Pods. You describe the desired cluster state in the Deployment, and the Deployment Controller gradually drives the current state toward it at a controlled rate. A Deployment's main responsibility is likewise keeping Pods healthy and at the right count, and most of its features are identical to the Replication Controller's, so it can be seen as a superset of a new-generation RC. Its features include event and status inspection, rollback, revision history, pause and resume, and multiple upgrade strategies (Recreate, RollingUpdate).
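A sketch of a Deployment expressing the desired state described above (names and image are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deploy        # hypothetical name
spec:
  replicas: 3             # desired state: three Pod replicas
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate   # one of the upgrade strategies mentioned above
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.17
```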

Job

  • By runtime behavior, Pods fall into two classes: long-running services (jboss, mysql, nginx, etc.) and one-off tasks (such as parallel data processing or tests). Pods created by an RC are long-running services; Pods created by a Job are one-off tasks.
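A sketch of a one-off task as a Job (the pi computation is a stock illustration, not from this setup):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-once           # hypothetical one-off task
spec:
  template:
    spec:
      containers:
        - name: pi
          image: perl
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never  # Job Pods run to completion instead of restarting
  backoffLimit: 4
```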

StatefulSet

  • StatefulSet is for stateful applications and distributed systems. It is comparatively complex; adopt it only when the application needs:
    • A unique, stable network identifier.
    • Stable, persistent data storage.
    • Ordered deployment and scaling.
    • Ordered deletion and termination.
    • Ordered automated rolling updates.

Service

  • Kubernetes Service documentation
  • A Service is an abstraction over real application services: it defines a logical set of Pods and a policy for accessing them. The Service presents the proxied Pods as a single access endpoint; the outside world need not know how the backend Pods run, which greatly helps scaling and maintenance and provides a simplified service proxy and discovery mechanism. There are four Service types:
    • ClusterIP: exposes the service on a cluster-internal IP, reachable only inside the cluster and not by external clients. This is the default type.
    • NodePort: built on top of ClusterIP; exposes the service on a static port (the NodePort) of each node's IP. It still allocates a cluster IP for the Service and uses it as the routing target of the NodePort.
    • LoadBalancer: built on top of NodePort; exposes the service outside the cluster via a cloud provider's load balancer, so it also has a NodePort and a ClusterIP. (Currently only cloud providers support this; in a VirtualBox lab only ClusterIP and NodePort are usable.)
    • ExternalName: exposes the service by mapping it to the hostname given in the externalName field; that hostname must resolve via DNS to a CNAME record.
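A sketch of the NodePort type described above (the name, selector, and port numbers are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc           # hypothetical name
spec:
  type: NodePort          # also allocates a ClusterIP, as described above
  selector:
    app: web              # routes to Pods carrying this label
  ports:
    - port: 80            # Service (cluster IP) port
      targetPort: 80      # container port
      nodePort: 30080     # static port opened on every node
```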

Ingress

  • The Ingress resource implements an "HTTP(S) load balancer". It is one of the standard k8s API resource types: essentially a set of rules that forward requests to a given Service based on DNS name or URL path, used to route traffic from outside the cluster to services inside it. The Ingress resource itself does not carry traffic; it is only a collection of routing rules. Unlike the Deployment controller and others, the Ingress controller does not run as part of kube-controller-manager; like CoreDNS, it is an important cluster add-on that must be deployed separately.
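A sketch of such a rule set, forwarding by host and path to a Service (the API group matches the k8s 1.14 era used in this post; host and Service name are hypothetical):

```yaml
apiVersion: extensions/v1beta1   # networking.k8s.io/v1 in newer releases
kind: Ingress
metadata:
  name: web-ing           # hypothetical name
spec:
  rules:
    - host: www.example.com   # forward by DNS name...
      http:
        paths:
          - path: /           # ...and URL path
            backend:
              serviceName: web-svc   # hypothetical backend Service
              servicePort: 80
```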

Label

  • A Label is a key/value pair used to distinguish Pods, Services, and Replication Controllers; in fact any API object in Kubernetes can be tagged with Labels. An API object can carry multiple Labels, but each Label key maps to exactly one value. Labels are the basis on which Services and Replication Controllers operate: both associate with Pods through Labels, a very useful loose coupling compared with a rigid binding model.

Node

  • Kubernetes has a master/worker distributed cluster architecture. A Kubernetes Node (simply Node, called Minion in early versions) runs and manages containers. As Kubernetes' unit of operation, a Node is what Pods (that is, containers) are bound and scheduled to; Pods ultimately run on Nodes, so a Node can be thought of as a Pod's host machine.

Kubernetes Architecture

Master Node

  • The Master is the brain of the k8s cluster. It runs:
    • kube-apiserver
    • kube-scheduler
    • kube-controller-manager
    • etcd, and the Pod network (e.g. Flannel, Canal)
API Server (kube-apiserver)
  • The API Server is the cluster's front-end interface; various tools manage the cluster's resources through it.
Scheduler (kube-scheduler)
  • The Scheduler decides which Node each Pod runs on, taking the cluster topology fully into account to find an optimal placement.
Controller Manager (kube-controller-manager)
  • The Controller Manager manages the cluster's resources and keeps them in their desired state. It is composed of multiple controllers.
etcd
  • etcd stores k8s configuration and the state of all resources. When data changes, etcd quickly notifies the relevant k8s components.
Pod network
  • For Pods to communicate with each other, k8s must deploy a Pod network; flannel is one of the options.

Worker Node

  • Nodes are where Pods run; k8s supports Docker, rkt, and other container runtimes. The components running on a Node are kubelet, kube-proxy, and the Pod network.
kubelet
  • The kubelet is the Node's agent. When the Scheduler has placed a Pod on a Node, it sends that Pod's concrete configuration (image, volumes, ...) to the Node's kubelet; the kubelet creates and runs the containers accordingly and reports status back to the master.
kube-proxy
  • A Service logically represents multiple backend Pods, and the outside world reaches the Pods through the Service. kube-proxy forwards the requests a Service receives to the corresponding Pods, load-balancing across replicas when there are several.
Secret & ConfigMap
  • A Secret supplies Pods with sensitive data such as passwords, tokens, and private keys; for non-sensitive data such as application configuration, use a ConfigMap.
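A sketch of the two side by side: a ConfigMap for plain settings and a Secret for a password (all names and values hypothetical; Secret data is base64-encoded):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  log_level: info          # plain, non-sensitive configuration
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  password: cGFzc3dvcmQ=   # base64("password")
```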

Add-ons

Networking
  • ACI provides integrated container networking and network security with Cisco ACI.
  • Calico is a secure L3 networking and network policy provider.
  • Canal unites Flannel and Calico, providing networking and network policy.
  • Cilium is a L3 network and network policy plugin that can enforce HTTP/API/L7 policies transparently. Both routing and overlay/encapsulation mode are supported.
  • CNI-Genie enables Kubernetes to seamlessly connect to a choice of CNI plugins, such as Calico, Canal, Flannel, Romana, or Weave.
  • Contiv provides configurable networking (native L3 using BGP, overlay using vxlan, classic L2, and Cisco-SDN/ACI) for various use cases and a rich policy framework. Contiv project is fully open sourced. The installer provides both kubeadm and non-kubeadm based installation options.
  • Contrail, based on Tungsten Fabric, is a open source, multi-cloud network virtualization and policy management platform. Contrail and Tungsten Fabric are integrated with orchestration systems such as Kubernetes, OpenShift, OpenStack and Mesos, and provide isolation modes for virtual machines, containers/pods and bare metal workloads.
  • Flannel is an overlay network provider that can be used with Kubernetes.
  • Knitter is a network solution supporting multiple networking in Kubernetes.
  • Multus is a Multi plugin for multiple network support in Kubernetes to support all CNI plugins (e.g. Calico, Cilium, Contiv, Flannel), in addition to SRIOV, DPDK, OVS-DPDK and VPP based workloads in Kubernetes.
  • NSX-T Container Plug-in (NCP) provides integration between VMware NSX-T and container orchestrators such as Kubernetes, as well as integration between NSX-T and container-based CaaS/PaaS platforms such as Pivotal Container Service (PKS) and OpenShift.
  • Nuage is an SDN platform that provides policy-based networking between Kubernetes Pods and non-Kubernetes environments with visibility and security monitoring.
  • Romana is a Layer 3 networking solution for Pod networks that also supports the NetworkPolicy API. Kubeadm add-on installation details available here.
  • Weave Net provides networking and network policy, will carry on working on both sides of a network partition, and does not require an external database.
Service Discovery
  • CoreDNS is a flexible, extensible DNS server which can be installed as the in-cluster DNS for pods.
Dashboards
  • Dashboard is a dashboard web interface for Kubernetes.
  • Weave Scope is a tool for graphically visualizing your containers, pods, services etc. Use it in conjunction with a Weave Cloud account or host the UI yourself.

Installing Minikube

  • Minikube is the simplest and fastest way to run a Kubernetes cluster. It builds a single-node cluster, useful for testing Kubernetes and for local development. Here it is installed on Debian; by default it drives a VirtualBox VM, though the kvm2 driver can be installed instead. Minikube's reference documentation covers the docker-machine parameters.
  • On first minikube start it creates a ~/.minikube directory and downloads minikube-vxxx.iso into ~/.minikube/cache/. If the download is very slow, fetch https://storage.googleapis.com/minikube/iso/minikube-v0.35.0.iso manually and copy it into that directory.
~$ curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.35.0/minikube-linux-amd64 && chmod +x minikube && sudo cp minikube /usr/local/bin/ && rm minikube

# Can also be downloaded from https://storage.googleapis.com/minikube/releases/v0.35.0/minikube-linux-amd64
# Download the kvm2 driver
~$ wget https://github.com/kubernetes/minikube/releases/download/v0.35.0/docker-machine-driver-kvm2
~$ chmod +x docker-machine-driver-kvm2 && sudo mv docker-machine-driver-kvm2 /usr/local/bin/
  • Start a Minikube VM. A bare minikube start defaults to VirtualBox. Networking uses default; the --kvm-network parameter takes its values from virsh net-list. minikube-net is an isolated network, which does not mean it cannot reach the Internet; for that, it must be bridged or NATed to a host NIC or network.
  • k8s.gcr.io is not directly reachable from mainland China, so minikube cannot pull its images and ends up unusable. There are a few workarounds; the simplest is a proxy:
    --docker-env HTTP_PROXY=<ip:port> --docker-env HTTPS_PROXY=<ip:port> --docker-env NO_PROXY=127.0.0.1,localhost
  • If the server can reach the Internet, add --insecure-registry k8s.gcr.io to OPTIONS in the docker daemon's startup parameters (/etc/sysconfig/docker).
~$  minikube start --vm-driver=kvm2 --kvm-network minikube-net --registry-mirror=https://registry.docker-cn.com --kubernetes-version v1.14.0

# Pulled from Alibaba Cloud; it is better to pull directly from k8s.gcr.io through a proxy if you can.
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.14.0
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.14.0
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.14.0
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.14.0
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.3.1
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.10
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1

# Re-tag them in bulk with a script; they could also be pushed to your company's private registry. In testing, the re-tagging approach did not quite work; a proxy is still needed to download normally.
~$ docker images | grep "aliyuncs.com" | awk '{split($1,a,"/"); print "docker tag " $1":"$2 " k8s.gcr.io/"a[3]":"$2}'
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.14.0 k8s.gcr.io/kube-proxy:v1.14.0
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.14.0 k8s.gcr.io/kube-apiserver:v1.14.0
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.14.0 k8s.gcr.io/kube-scheduler:v1.14.0
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.14.0 k8s.gcr.io/kube-controller-manager:v1.14.0
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.10 k8s.gcr.io/etcd:3.3.10
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1

# Then initialize the cluster manually with these images.
~$ sudo kubeadm init --kubernetes-version=v1.14.0
  • Testing shows kubeadm can also install by pulling directly from other domestic mirrors, and installing through a proxy works too, though finding a stable, reliable proxy is somewhat hard. minikube start --vm-driver=kvm2 creates a VM directly and runs sudo kubeadm config images pull --config /var/lib/kubeadm.yaml inside it to pull the docker images for the various k8s services; by default it pulls from https://k8s.gcr.io/v2/.

    ~$ minikube start --vm-driver=kvm2 --kvm-network minikube-net --registry-mirror=https://registry.docker-cn.com
  • It turns out the --registry-mirror parameter has no effect; the error below still appears:

     Unable to pull images, which may be OK: running cmd: sudo kubeadm config images pull --config /var/lib/kubeadm.yaml: command failed: sudo kubeadm config images pull --config /var/lib/kubeadm.yaml
    stdout:
    stderr: failed to pull image "k8s.gcr.io/kube-apiserver:v1.14.0": output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

  • Download kubectl to the local development (control) machine

    ~$ curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
    ~$ chmod +x kubectl && sudo mv kubectl /usr/local/bin

    # Enable shell completion
    ~$ echo "source <(kubectl completion bash)" >> ~/.bashrc
    ~$ kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    minikube Ready <none> 94m v1.13.4

Building Minikube from Source

  • Download the latest Go toolchain

    ~$ wget -c https://golang.google.cn/doc/install?download=go1.12.4.linux-amd64.tar.gz
    ~$ sudo tar xvf go1.12.4.linux-amd64.tar.gz -C /opt/
    ~$ export PATH=/opt/go/bin:$PATH
  • Fetch the source: first create /opt/go/src/k8s.io, clone into it, and rewrite the image registry addresses.

    ~$ sudo mkdir /opt/go/src/k8s.io && cd /opt/go/src/k8s.io &&   git clone https://github.com/kubernetes/minikube.git
    ~$ cd minikube && for item in `grep -l "k8s.gcr.io" -r *`;do sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" $item ;done
    ~$ make
    ~$ sudo cp out/minikube-linux-amd64 /usr/local/bin/minikube
  • In summary, since the docker tag approach still has problems, building Minikube from source is a clean way around the firewall; third-party Minikube binaries downloaded from the net risk carrying something unwanted.

Error Handling

⌛  Waiting for pods: apiserver proxy💣  Error restarting cluster: wait: waiting for k8s-app=kube-proxy: timed out waiting for the condition

😿 Sorry that minikube crashed. If this was unexpected, we would love to hear from you:
👉 https://github.com/kubernetes/minikube/issues/new

Installing Kubernetes Components by Hand

Installing kubeadm on Debian/Ubuntu distributions

  • kubeadm: the command to bootstrap the cluster.
  • kubelet: the component that runs on all of the machines in your cluster and does things like starting pods and containers.
  • kubectl: the command line util to talk to your cluster.
  • Pay attention to version compatibility between these three components when installing them.

The official package repository (not reachable from mainland China)

~$ apt-get update && apt-get install -y apt-transport-https curl dirmngr
~$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
# Or install the public key this way instead; it requires dirmngr
~$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 6A030B21BA07F4FB
~$ sudo bash -c "cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF"
# In testing, apt.kubernetes.io redirects to k8s.io, which is unreachable; the mirror http://mirrors.ustc.edu.cn/kubernetes/ can be used instead
~$ apt-get update
~$ apt-get install -y kubelet kubeadm kubectl
~$ apt-mark hold kubelet kubeadm kubectl

The Aliyun mirror repository

~$ apt-get update && apt-get install -y apt-transport-https
~$ curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
~$ sudo bash -c "cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF"
~$ sudo apt-get update && sudo apt-get install kubelet kubeadm kubectl -y
~$ sudo apt-mark hold kubelet kubeadm kubectl

# kubectl can also be downloaded directly with curl (a proxy is needed).
~$ curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl

  • Query which docker images the current version requires.
~$ kubeadm config images list --kubernetes-version v1.14.0
k8s.gcr.io/kube-apiserver:v1.14.0
k8s.gcr.io/kube-controller-manager:v1.14.0
k8s.gcr.io/kube-scheduler:v1.14.0
k8s.gcr.io/kube-proxy:v1.14.0
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.10
k8s.gcr.io/coredns:1.3.1
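Since the default registry is blocked, the listed images are usually pulled from a mirror and re-tagged. A minimal sketch of the name rewrite, reusing the same Aliyun namespace this document uses elsewhere:

```shell
# Rewrite each k8s.gcr.io image name to the Aliyun mirror namespace,
# mirroring the sed trick applied to the Minikube source tree above.
images='k8s.gcr.io/kube-apiserver:v1.14.0
k8s.gcr.io/pause:3.1'
echo "$images" | sed 's#^k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#'
```

You would then `docker pull` the rewritten names and `docker tag` them back to the k8s.gcr.io names that kubeadm expects.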

Installing the Master Node

Network Models

  • Kubernetes currently supports several network solutions, e.g. Flannel, Canal, Weave Net and Calico, all of which implement the CNI specification.
    Calico is installed below as the example. For the other installers see the "Installing a Pod network add-on" section of the official docs (also available in Chinese). A network must be chosen at kubeadm init time, otherwise other problems appear later. Per that guide:
    • Calico network model: kubeadm init --pod-network-cidr=192.168.0.0/16; it works only on amd64, arm64 and ppc64le.
    • Flannel network model: kubeadm init --pod-network-cidr=10.244.0.0/16, plus the kernel parameter
      sysctl net.bridge.bridge-nf-call-iptables=1; it works on Linux amd64, arm, arm64, ppc64le and s390x.
~$ kubectl apply -f \
https://docs.projectcalico.org/v3.6/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
[...]
serviceaccount/calico-kube-controllers created
  • Install the Master node. Inside mainland China --image-repository must be given, since the default k8s.gcr.io is not directly reachable, and --kubernetes-version must match the installed kubelet version.
~$ sudo kubeadm init --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers  --kubernetes-version v1.14.0 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[...]
# After a successful install, follow the steps printed below in order.
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a Pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.18.127.186:6443 --token z8r97j.3ovdfddb6df9lnq7 \
--discovery-token-ca-cert-hash sha256:07767a67fa6c38feda7471ee5e1a15a0a9c417cfdf6cf457ff577297f22d9415
  • Install the Flannel network plugin, matching the kubeadm init arguments
    ~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
    clusterrole.rbac.authorization.k8s.io/flannel created
    clusterrolebinding.rbac.authorization.k8s.io/flannel created
    serviceaccount/flannel created
    configmap/kube-flannel-cfg created
    daemonset.extensions/kube-flannel-ds-amd64 created
    daemonset.extensions/kube-flannel-ds-arm64 created
    daemonset.extensions/kube-flannel-ds-arm created
    daemonset.extensions/kube-flannel-ds-ppc64le created
    daemonset.extensions/kube-flannel-ds-s390x created
  • Following the success message above, configure access to the Master node; for installing the network plugin, see here.
~$ mkdir -p $HOME/.kube
~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

# List the bootstrap tokens
~$ kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
5smm64.9zpyhaqxghohh6b2 23h 2019-04-20T14:45:14+08:00 authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
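The --discovery-token-ca-cert-hash printed by kubeadm init is the SHA-256 digest of the cluster CA's DER-encoded public key. A sketch of recomputing it — a throwaway certificate is generated here for demonstration; on a real master you would point openssl at /etc/kubernetes/pki/ca.crt:

```shell
# Generate a throwaway CA certificate (stand-in for /etc/kubernetes/pki/ca.crt).
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-ca.key \
    -out /tmp/demo-ca.crt -days 1 -subj "/CN=demo-ca" 2>/dev/null
# sha256 of the DER-encoded public key -- the value kubeadm prints after "sha256:".
openssl x509 -in /tmp/demo-ca.crt -noout -pubkey \
    | openssl pkey -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256
```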

Modifying Kubelet Startup Parameters

  • The kubelet component is managed by systemd, so its unit configuration can be found under /etc/systemd/system on every node.
    ~# cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    # Note: This dropin only works with kubeadm and kubelet v1.11+
    [Service]
    Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
    Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
    # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
    EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
    # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
    # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
    EnvironmentFile=-/etc/default/kubelet
    ExecStart=
    ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
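Per the comments in the dropin, ad-hoc flags belong in /etc/default/kubelet via KUBELET_EXTRA_ARGS. A sketch — the --node-ip value is a hypothetical example, and a temp file stands in for /etc/default/kubelet here:

```shell
# Write the last-resort override file that the dropin sources (temp path for demonstration).
f=$(mktemp)
echo 'KUBELET_EXTRA_ARGS="--node-ip=172.18.127.186"' > "$f"
cat "$f"
# On a real node: write /etc/default/kubelet, then
#   systemctl daemon-reload && systemctl restart kubelet
```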

Installing Cluster Nodes

  • Install the same three components on another machine, then run the command below to join it to the k8s cluster.
    ~$ sudo kubeadm join 172.18.127.186:6443 --token z8r97j.3ovdfddb6df9lnq7     --discovery-token-ca-cert-hash sha256:07767a67fa6c38feda7471ee5e1a15a0a9c417cfdf6cf457ff577297f22d9415
    [preflight] Running pre-flight checks
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
    [preflight] Reading configuration from the cluster...
    [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
    [kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Activating the kubelet service
    [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

    This node has joined the cluster:
    * Certificate signing request was sent to apiserver and a response was received.
    * The Kubelet was informed of the new secure connection details.

    Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
  • Check the cluster nodes on the Master node.
~$ kubectl  get node
NAME STATUS ROLES AGE VERSION
aliyun-machine Ready master 110m v1.14.0
dig001 Ready <none> 84m v1.14.0
fe001 Ready <none> 2m8s v1.14.0
  • View the full cluster layout. The Master can also run workloads, i.e. the Master is itself a Node.
    ~$ kubectl get pod --all-namespaces -o wide
    NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
    kube-system coredns-d5947d4b-sr5zt 1/1 Running 0 6m3s 10.244.0.6 k8s-master <none> <none>
    kube-system coredns-d5947d4b-tznh2 1/1 Running 0 6m3s 10.244.0.5 k8s-master <none> <none>
    kube-system etcd-k8s-master 1/1 Running 0 5m11s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-apiserver-k8s-master 1/1 Running 0 5m15s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-controller-manager-k8s-master 1/1 Running 0 5m13s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-flannel-ds-amd64-9d965 1/1 Running 0 5m17s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-flannel-ds-amd64-c8dkh 1/1 Running 0 38s 172.18.192.76 dig001 <none> <none>
    kube-system kube-flannel-ds-amd64-kswj2 1/1 Running 0 52s 172.18.253.222 fe001 <none> <none>
    kube-system kube-proxy-5g9vp 1/1 Running 0 38s 172.18.192.76 dig001 <none> <none>
    kube-system kube-proxy-cqzfl 1/1 Running 0 52s 172.18.253.222 fe001 <none> <none>
    kube-system kube-proxy-pjbbg 1/1 Running 0 6m3s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-scheduler-k8s-master 1/1 Running 0 5m19s 172.18.127.186 k8s-master <none> <none>
  • Get a Pod's full details
    ~$ kubectl get pod <podname> --output json   # show the Pod's full details as JSON
    ~$ kubectl get pod <podname> --output yaml # show the Pod's full details as YAML

Tearing Down the k8s Cluster

~$ kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
~$ kubectl delete node <node name>
~$ kubeadm reset

A HelloWorld Pod That Shares Data Between Containers

apiVersion: v1
kind: Pod
metadata:
  name: hello-world
spec:
  restartPolicy: Never
  containers:
  - name: write
    image: debian:latest
    volumeMounts:
    - name: data
      mountPath: /data
    command: ["bash","-c","echo \"Hello World\" >> /data/hello"]
  - name: read
    image: debian:latest
    volumeMounts:
    - name: data
      mountPath: /data
    command: ["bash","-c","sleep 10; cat /data/hello"]
  volumes:
  - name: data
    hostPath:
      path: /tmp
~$ kubectl apply -f hello-world.yaml
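What the two containers do can be checked locally without a cluster: both mount the same hostPath directory, so the writer's file is visible to the reader. A plain-shell emulation of the two commands (the sleep, shortened here, just gives the writer time to finish):

```shell
# Emulate the shared hostPath volume with a temp directory.
data=$(mktemp -d)
bash -c "echo \"Hello World\" >> $data/hello"   # the 'write' container's command
bash -c "sleep 1; cat $data/hello"              # the 'read' container's command
```

The second command prints the line the first one wrote, which is exactly what `kubectl logs hello-world -c read` should show for the Pod.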

Installing Kubernetes from GitHub

Etcd

~$ wget -c https://github.com/etcd-io/etcd/releases/download/v3.3.12/etcd-v3.3.12-linux-amd64.tar.gz
~$ tar zxvf etcd-v3.3.12-linux-amd64.tar.gz
~$ cd etcd-v3.3.12-linux-amd64 && sudo cp {etcd,etcdctl} /usr/local/bin
  • Run Etcd

    ~$ etcd -name etcd --data-dir /var/lib/etcd -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
    -advertise-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 >> /var/log/etcd.log 2>&1 &
  • Query its health status

    ~$ etcdctl -C http://127.0.0.1:4001 cluster-health
    member 8e9e05c52164694d is healthy: got healthy result from http://0.0.0.0:2379
    cluster is healthy

Installing from the Kubernetes Release Tarball

  • Kubernetes

  • Download the latest release from GitHub.

    ~$ wget -c https://github.com/kubernetes/kubernetes/releases/download/v1.14.0/kubernetes.tar.gz
    ~$ tar xvf kubernetes.tar.gz && cd kubernetes
    ~$ tree -L 1
    .
    ├── client
    ├── cluster
    ├── docs
    ├── hack
    ├── LICENSES
    ├── README.md
    ├── server
    └── version

    $ cat server/README
    Server binary tarballs are no longer included in the Kubernetes final tarball.

    Run cluster/get-kube-binaries.sh to download client and server binaries.
  • As the README above says, the server binaries are not included in this tarball; running cluster/get-kube-binaries.sh downloads them from https://dl.k8s.io/v1.14.0, but dl.k8s.io is not directly reachable from mainland China.

Installing MinIO (single-node service)

MinIO Server

Creating a PV (Persistent Volume)

  • In a Kubernetes environment you can use the MinIO Kubernetes Operator.
  • Below is a resource manifest that creates a 10G, local-type PV. A PV can be understood as a slice of some networked storage backend in the k8s cluster, quite similar to a Volume. A PV belongs to no particular Node but is accessible from every Node, and it is defined independently of any Pod. Currently supported PV types include:
    • GCEPersistentDisk,
    • AWSElasticBlockStore,
    • AzureFile, FC (Fibre Channel),
    • NFS,
    • iSCSI,
    • RBD (Rados Block Device),
    • CephFS,
    • GlusterFS,
    • HostPath (single-machine testing only), etc.
~$ cat pv.yaml

kind: PersistentVolume
apiVersion: v1
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data"

~$ kubectl create -f pv.yaml

Installing the MinIO PVC (Persistent Volume Claim)

  • A PVC requests a certain amount of storage, and k8s binds it to a PV that satisfies the request. If a PVC stays unbound after creation it ends up Pending, so run this experiment in the order PV -> PVC -> Deployment -> Service. The PV phases are:
    • Available: unused,
    • Bound: bound to some PVC,
    • Released: its PVC has been deleted, but the resource has not yet been reclaimed by the cluster,
    • Failed: automatic reclamation of the PV failed.
~$ kubectl create -f https://github.com/minio/minio/blob/master/docs/orchestration/kubernetes/minio-standalone-pvc.yaml?raw=true
persistentvolumeclaim/minio-pv-claim created
  • You can also download the file at that link to edit locally:

    ~$ cat minio-pvc.yaml
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      # This name uniquely identifies the PVC. This is used in deployment.
      name: minio-pv-claim
    spec:
      # Read more about access modes here: http://kubernetes.io/docs/user-guide/persistent-volumes/#access-modes
      storageClassName: standard
      accessModes:
        # The volume is mounted as read-write by a single node
        - ReadWriteOnce
      resources:
        # This is the request for storage. Should be available in the cluster.
        requests:
          storage: 10Gi
  • Check the PVC status in the system. It shows Pending below; use describe to see the details.

    ~$ kubectl get pvc  --namespace default
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    minio-pv-claim Pending
    ~$ kubectl get pvc --namespace default
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    minio-pv-claim Pending 2m22s
    lcy@k8s-master:~$ kubectl describe pvc minio-pv-claim
    Name: minio-pv-claim
    Namespace: default
    StorageClass:
    Status: Pending
    Volume:
    Labels: <none>
    Annotations: <none>
    Finalizers: [kubernetes.io/pvc-protection]
    Capacity:
    Access Modes:
    VolumeMode: Filesystem
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Normal FailedBinding 4s (x14 over 3m2s) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
    Mounted By: minio-6d4d48db87-wxr4d
  • The error shown in the Events above:

    FailedBinding  4s (x14 over 3m2s)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set
  • That is why it is Pending. Continue to the next step and fix these missing dependencies.

Installing the MinIO Deployment

~$ kubectl create -f https://github.com/minio/minio/blob/master/docs/orchestration/kubernetes/minio-standalone-deployment.yaml?raw=true
deployment.extensions/minio created

~$ cat minio-standalone-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  # This name uniquely identifies the Deployment
  name: minio
spec:
  strategy:
    # Specifies the strategy used to replace old Pods by new ones
    # Refer: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
    type: Recreate
  template:
    metadata:
      labels:
        # This label is used as a selector in Service definition
        app: minio
    spec:
      # Volumes used by this deployment
      volumes:
      - name: data
        # This volume is based on PVC
        persistentVolumeClaim:
          # Name of the PVC created earlier
          claimName: minio-pv-claim
      containers:
      - name: minio
        # Volume mounts for this container
        volumeMounts:
        # Volume 'data' is mounted to path '/data'
        - name: data
          mountPath: "/data"
        # Pulls the latest Minio image from Docker Hub
        image: minio/minio:RELEASE.2019-04-18T21-44-59Z
        args:
        - server
        - /data
        env:
        # MinIO access key and secret key
        - name: MINIO_ACCESS_KEY
          value: "minio"
        - name: MINIO_SECRET_KEY
          value: "minio123"
        ports:
        - containerPort: 9000
        # Readiness probe detects situations when MinIO server instance
        # is not ready to accept traffic. Kubernetes doesn't forward
        # traffic to the pod while readiness checks fail.
        readinessProbe:
          httpGet:
            path: /minio/health/ready
            port: 9000
          initialDelaySeconds: 120
          periodSeconds: 20
        # Liveness probe detects situations where MinIO server instance
        # is not working properly and needs restart. Kubernetes automatically
        # restarts the pods if liveness checks fail.
        livenessProbe:
          httpGet:
            path: /minio/health/live
            port: 9000
          initialDelaySeconds: 120
          periodSeconds: 20
~$ kubectl get deployment --namespace default
NAME READY UP-TO-DATE AVAILABLE AGE
minio 0/1 1 0 82s

Installing the MinIO Service

~$ kubectl create -f https://github.com/minio/minio/blob/master/docs/orchestration/kubernetes/minio-standalone-service.yaml?raw=true
service/minio-service created

~$ cat minio-standalone-service.yaml
apiVersion: v1
kind: Service
metadata:
  # This name uniquely identifies the service
  name: minio-service
spec:
  type: LoadBalancer
  ports:
    - port: 9000
      targetPort: 9000
      protocol: TCP
  selector:
    # Looks for labels `app:minio` in the namespace and applies the spec
    app: minio

~$ kubectl get svc minio-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
minio-service LoadBalancer 10.100.57.156 <pending> 9000:32552/TCP 10m
# Find out why it is Pending
~$ kubectl describe pod --namespace default -l app=minio
Name: minio-756cb7dff7-mcm6m
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: fe001/172.18.253.222
Start Time: Fri, 19 Apr 2019 15:20:30 +0800
Labels: app=minio
pod-template-hash=756cb7dff7
Annotations: <none>
Status: Running
IP: 10.244.1.4
Controlled By: ReplicaSet/minio-756cb7dff7
Containers:
minio:
Container ID: docker://119535fa5ab172b5b2155c650dc51c2d12b3c02b1e28ab9e8301eb318ab969a7
Image: minio/minio:RELEASE.2019-04-18T21-44-59Z
Image ID: docker-pullable://minio/minio@sha256:a26e089732b85f8c312ff6346498acec763033b1ac85e74fc897f667939ea2aa
Port: 9000/TCP
Host Port: 0/TCP
Args:
server
/data
State: Running
Started: Fri, 19 Apr 2019 15:20:51 +0800
Ready: True
Restart Count: 0
Liveness: http-get http://:9000/minio/health/live delay=120s timeout=1s period=20s #success=1 #failure=3
Readiness: http-get http://:9000/minio/health/ready delay=120s timeout=1s period=20s #success=1 #failure=3
Environment:
MINIO_ACCESS_KEY: minio
MINIO_SECRET_KEY: minio123
Mounts:
/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-2vsh9 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: minio-pv-claim
ReadOnly: false
default-token-2vsh9:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-2vsh9
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 5m10s (x7 over 12m) default-scheduler 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
Normal Scheduled 5m8s default-scheduler Successfully assigned default/minio-756cb7dff7-mcm6m to fe001
Normal Pulling 5m7s kubelet, fe001 Pulling image "minio/minio:RELEASE.2019-04-18T21-44-59Z"
Normal Pulled 4m47s kubelet, fe001 Successfully pulled image "minio/minio:RELEASE.2019-04-18T21-44-59Z"
Normal Created 4m47s kubelet, fe001 Created container minio
Normal Started 4m47s kubelet, fe001 Started container minio

  • As shown above, the Pending state was caused by: Warning FailedScheduling 5m10s (x7 over 12m) default-scheduler 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
~$ kubectl get pod --namespace default -l app=minio
NAME READY STATUS RESTARTS AGE
minio-756cb7dff7-k2sdk 1/1 Running 0 15m
# Execute a command inside a running container. The double dash below marks the end of kubectl's own options.
~$ kubectl exec minio-756cb7dff7-k2sdk -- ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=1 ttl=41 time=25.684 ms
# Run a shell inside the container
~$ kubectl exec -it minio-756cb7dff7-k2sdk sh
/ #
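The `--` convention is plain POSIX option parsing, not kubectl magic: everything after it is handed verbatim to the command in the container. A cluster-free sketch of the same parsing rule:

```shell
# Everything before -- is for the tool itself; everything after it is the remote command.
set -- exec minio-756cb7dff7-k2sdk -- ping -c 1 8.8.8.8
while [ "$1" != "--" ]; do shift; done
shift
echo "$@"    # the part that would be run inside the container
```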

Exposing Services via Ingress

  • There are two ways to expose a service to clients outside the cluster: LoadBalancer and Ingress. Every LoadBalancer service needs its own load balancer and a dedicated public IP address, whereas a single Ingress needs only one public IP to serve many services. When a client sends an HTTP request to the Ingress, the Ingress decides which service to forward it to based on the request's host name and path. An Ingress operates at the application (HTTP) layer of the network stack, so it can offer features that plain Services cannot, such as cookie-based session affinity.

The Traefik Reverse Proxy

Quick Install and Test

  • Get a Traefik configuration: copy a simple one from here, or download a Traefik binary from https://github.com/traefik/traefik/releases and run it like this:

    ~$ ./traefik -c traefik.toml
  • Or run it directly with Docker for the test, then open http://127.0.0.1:8080/dashboard/#/ in a browser to view the dashboard. Traefik has a few important core components:

    • Providers
    • Entrypoints
    • Routers
    • Services
    • Middlewares
      ~$ docker run -d -p 8080:8080 -p 80:80 -v $PWD/traefik.toml:/etc/traefik/traefik.toml traefik

A docker-compose walkthrough

  • The simple compose setup below is meant to give an intuitive picture of what Traefik is for and how it is used. Version v2.5 is used here, which differs from the old v1.2. The test directory layout:
~$ tree
.
├── minio
│   └── docker-compose.yml
├── traefik
│   ├── docker-compose-v2.yml
│   ├── docker-compose.yml
│   └── traefik.toml
└── whoami-app
├── docker-compose-v2.yml
└── docker-compose.yml

  • Run the Traefik instance

    traefik$ cat docker-compose-v2.yml
    version: '3'

    services:
      reverse-proxy:
        # The official v2 Traefik docker image
        image: traefik:v2.5
        # Enables the web UI and tells Traefik to listen to docker
        command: --api.insecure=true --providers.docker
        ports:
          # The HTTP port
          - "80:80"
          # The Web UI (enabled by --api.insecure=true)
          - "8080:8080"
        volumes:
          # So that Traefik can listen to the Docker events
          - /var/run/docker.sock:/var/run/docker.sock
    traefik$ docker-compose -f docker-compose-v2.yml up -d
    Creating network "traefik_default" with the default driver
    Creating traefik_reverse-proxy_1 ... done
  • Run 3 test instances (traefik/whoami)

whoami-app$ cat docker-compose-v2.yml
version: '3'
services:
  whoami:
    # A container that exposes an API to show its IP address
    image: traefik/whoami
    networks:
      - traefik_default
    labels:
      - traefik.docker.network=traefik_default
      - traefik.http.routers.whoami.rule=Host(`whoami.docker.localhost`)

networks:
  traefik_default:
    external: true

whoami-app$ docker-compose -f docker-compose-v2.yml up -d --scale whoami=3
Starting whoami-app_whoami_1 ... done
Creating whoami-app_whoami_2 ... done
Creating whoami-app_whoami_3 ... done
  • The service status can be viewed in the browser at http://127.0.0.1:8080/dashboard/#/http/services, or with the command below.

    ~$ curl  http://localhost:8080/api/rawdata | jq -c '.services'
    % Total % Received % Xferd Average Speed Time Time Time Current
    Dload Upload Total Spent Left Speed
    100 1829 100 1829 0 0 1786k 0 --:--:-- --:--:-- --:--:-- 1786k
    {"api@internal":{"status":"enabled","usedBy":["api@internal"]},"dashboard@internal":{"status":"enabled","usedBy":["dashboard@internal"]},"noop@internal":{"status":"enabled"},"reverse-proxy-traefik@docker":{"loadBalancer":{"servers":[{"url":"http://172.30.0.2:80"}],"passHostHeader":true},"status":"enabled","usedBy":["reverse-proxy-traefik@docker"],"serverStatus":{"http://172.30.0.2:80":"UP"}},"whoami-whoami-app@docker":{"loadBalancer":{"servers":[{"url":"http://172.31.0.4:80"},{"url":"http://172.31.0.2:80"},{"url":"http://172.31.0.3:80"}],"passHostHeader":true},"status":"enabled","usedBy":["whoami@docker"],"serverStatus":{"http://172.31.0.2:80":"UP","http://172.31.0.3:80":"UP","http://172.31.0.4:80":"UP"}}}
  • Or like this

    ~$ curl -H Host:whoami.docker.localhost http://127.0.0.1
    Hostname: 4c9f9a107136
    IP: 127.0.0.1
    IP: 172.19.0.6
    RemoteAddr: 172.19.0.1:46902
    GET / HTTP/1.1
    Host: whoami.docker.localhost
    User-Agent: curl/7.74.0
    Accept: */*
    Accept-Encoding: gzip
    X-Forwarded-For: 172.19.0.1
    X-Forwarded-Host: whoami.docker.localhost
    X-Forwarded-Port: 80
    X-Forwarded-Proto: http
    X-Forwarded-Server: 8a986b075043
    X-Real-Ip: 172.19.0.1

Enabling Certificates and Basic Auth

  • Below is a more elaborate docker-compose file that automatically obtains certificates from Let's Encrypt.

    version: "3.3"

    services:
      traefik:
        image: "traefik:v2.5"
        container_name: "traefik"
        command:
          - "--api.insecure=false"
          - "--providers.docker=true"
          - "--providers.docker.exposedbydefault=false"
          - "--providers.file.directory=/letsencrypt/"
          - "--entrypoints.web.address=:80"
          - "--entrypoints.websecure.address=:443"
          - "--certificatesresolvers.myresolver.acme.httpchallenge=true"
          - "--certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=web"
          #- "--certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
          - "--certificatesresolvers.myresolver.acme.email=<your email>@gmail.com"
          - "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json"
          # Global HTTP -> HTTPS
          - "--entrypoints.web.http.redirections.entryPoint.to=websecure"
          - "--entrypoints.web.http.redirections.entryPoint.scheme=https"
          # Enable dashboard
          - "--api.dashboard=true"
        ports:
          - "80:80"
          - "443:443"
          - "8080:8080"
        volumes:
          - "./certs:/letsencrypt"
          - "/var/run/docker.sock:/var/run/docker.sock:ro"
        labels:
          - "traefik.enable=true"
          - "traefik.http.services.api@internal.loadbalancer.server.port=8080" # required by swarm but not used.
          - "traefik.http.routers.traefik.rule=Host(`<Your FQDN domain name>`) && (PathPrefix(`/dashboard`) || PathPrefix(`/api`))"
          - "traefik.http.routers.traefik.middlewares=traefik-https-redirect"
          - "traefik.http.routers.traefik.entrypoints=websecure"
          - "traefik.http.routers.traefik.tls.certresolver=myresolver"
          - "traefik.http.routers.traefik.tls=true"
          - "traefik.http.routers.traefik.tls.options=default"
          - "traefik.http.middlewares.traefik-https-redirect.redirectscheme.scheme=https"
          - "traefik.http.routers.traefik.middlewares=traefik-auth"
          - "traefik.http.middlewares.traefik-auth.basicauth.users=<login name>:$$apr1$$Pf2MP/Oy$$...."
          - "traefik.http.routers.traefik.service=api@internal"
          #- 'traefik.http.routers.traefik.middlewares=strip'
          #- 'traefik.http.middlewares.strip.stripprefix.prefixes=/dashboard'

      whoami:
        image: "traefik/whoami"
        container_name: "simple-service"
        labels:
          - "traefik.enable=true"
          - "traefik.http.routers.whoami.rule=Host(`<Your FQDN domain name>`) && Path(`/whoami`)"
          - "traefik.http.routers.whoami.entrypoints=websecure"
          - "traefik.http.routers.whoami.tls.certresolver=myresolver"

  • The certificate above is obtained via httpchallenge (DNS validation also works), and the Traefik Dashboard is protected with TLS + Basic Auth. The whoami service is there as an example: with Host() && Path(`/whoami`) rules like this you can reverse-proxy many different internal services.

  • As shown above, the local mount ./certs:/letsencrypt is used, so the certs directory looks like below; acme.json holds the certificate material.

~$ tree ./certs/
./certs/
├── acme.json
└── tls.yml
  • tls.yml exists to harden the TLS settings; its content is below. You can score your server's TLS configuration at https://www.ssllabs.com; an A+ means it is in very good shape.

    ~$ cat certs/tls.yml
    tls:
      options:
        default:
          minVersion: "VersionTLS13"
          sniStrict: true
          cipherSuites:
            - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
            - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
            - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
            - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
            - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
            - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
  • Create the Basic Auth user and password. Usually the htpasswd command from apache2-utils is used, but it can also be done as below; every $ must be escaped as $$.

# -apr1  uses the apr1 algorithm (Apache variant of the BSD algorithm).

~$ openssl passwd -apr1
Password:
Verifying - Password:
$apr1$AzG7Y5HE$dZoKWVCmxffAe1oakeHR40

  • Or like this
~$ printf "<Your User>:$(openssl passwd -apr1 <Your password> | sed -E "s:[\$]:\$\$:g")\n"  >> ~/.htpasswd

~$ printf "admin:$(openssl passwd -apr1 admin | sed -E "s:[\$]:\$\$:g")\n"
admin:$$apr1$$7eSlrnJD$$XGLpWARS.YLxwYPoRtUdc.
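The $$ escaping can also be sanity-checked in isolation; this sketch round-trips the sample apr1 hash shown above:

```shell
# docker-compose treats $ as variable interpolation, so each $ in the hash must become $$.
hash='$apr1$AzG7Y5HE$dZoKWVCmxffAe1oakeHR40'
escaped=$(printf '%s' "$hash" | sed -E 's/\$/\$\$/g')
echo "$escaped"
```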

Installing MinIO with docker-compose

  • Run a slightly more elaborate MinIO service instance

    version: "3"

    services:
      minio:
        # Please use fixed versions :D
        image: minio/minio
        hostname: minio
        networks:
          - traefik_default
        volumes:
          - $PWD/minio-data:/data
        command:
          - server
          - /data
          - --console-address
          - ":9001"
        expose:
          - 9000
          - 9001
        environment:
          - MINIO_ROOT_USER=minio
          - MINIO_ROOT_PASSWORD=minio123
          - APP_NAME=minio
          # Do NOT use MINIO_DOMAIN or MINIO_SERVER_URL with Traefik.
          # All Routing is done by Traefik, just tell minio where to redirect to.
          - MINIO_BROWSER_REDIRECT_URL=http://minio-console.localhost
        labels:
          - traefik.enable=true
          - traefik.docker.network=traefik_default
          - traefik.http.routers.minio.rule=Host(`minio.localhost`)
          - traefik.http.routers.minio-console.rule=Host(`minio-console.localhost`)
          - traefik.http.routers.minio.service=minio
          - traefik.http.services.minio.loadbalancer.server.port=9000
          - traefik.http.services.minio-console.loadbalancer.server.port=9001
          - traefik.http.routers.minio-console.service=minio-console

    networks:
      traefik_default:
        external: true

    minio$ docker-compose up -d
    Creating minio_minio_1 ... done
  • After starting the instance above, open http://minio-console.localhost in a browser and log in to the MinIO console with minio:minio123. The reason a domain like minio-console.localhost reverse-proxies to the internal service is the routing: open http://localhost:8080/dashboard/#/http/routers to see the corresponding routes, or list them with the command below:

    ~$ curl http://localhost:8080/api/rawdata | jq   '.routers[] | .service'
    % Total % Received % Xferd Average Speed Time Time Time Current
    Dload Upload Total Spent Left Speed
    100 2509 0 2509 0 0 2450k 0 --:--:-- --:--:-- --:--:-- 2450k
    "api@internal"
    "dashboard@internal"
    "minio-console"
    "minio"
    "reverse-proxy-traefik"
    "whoami-whoami-app"
  • One thing to watch when testing: if the minio and traefik service definitions live in the same docker-compose.yml, no networks section is needed; if they are split across files, each service must declare that it joins traefik's network. In the tests above, docker-compose creates a network named traefik_default by default when it brings up the traefik service, which is why both the whoami and minio examples declare a networks field. The local Docker network list looks like this:

~$ docker network ls
NETWORK ID NAME DRIVER SCOPE
f8c2befaa42f bridge bridge local
f27beded9896 host host local
244ab53cbc48 none null local
18ddc4985478 traefik_default bridge local
1c5f5a863ef9 whoami-app_default bridge local
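
  • The network rule above boils down to one thing: a service deployed from a separate compose file must join Traefik's network explicitly. A minimal sketch (the service name and image here are placeholders, not from the original files):

    ```yaml
    # docker-compose.yml for a service deployed separately from traefik
    version: "3"
    services:
      app:
        image: traefik/whoami
        networks:
          - traefik_default   # join the network created by traefik's compose project

    networks:
      traefik_default:
        external: true        # do not create it; it already exists
    ```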
  • Traefik supports several ways to create routing rules, for example:

    • native Ingress resources
    • the CRD-based IngressRoute
    • the Gateway API
  • Stop the test instance

    whoami-app$ docker-compose -f docker-compose-v2.yml down
    Stopping whoami-app_whoami_1 ... done
    Stopping whoami-app_whoami_3 ... done
    Stopping whoami-app_whoami_2 ... done
    Removing whoami-app_whoami_1 ... done
    Removing whoami-app_whoami_3 ... done
    Removing whoami-app_whoami_2 ... done
    Removing network whoami-app_default

The mc client

  • MinIO Client Complete Guide Slack

    wget https://dl.min.io/client/mc/release/linux-amd64/mc
    chmod +x mc
    ./mc --help
  • Configure an S3 server endpoint

    ./mc config host add mystorage http://minio.localhost test1234access test1234secret --api s3v4
    mc: Configuration written to `/home/michael/.mc/config.json`. Please update your access credentials.
    mc: Successfully created `/home/michael/.mc/share`.
    mc: Initialized share uploads `/home/michael/.mc/share/uploads.json` file.
    mc: Initialized share downloads `/home/michael/.mc/share/downloads.json` file.
    Added `mystorage` successfully.

  • View server information

    ~$ ./mc admin info play/
    ● play.min.io
    Uptime: 2 days
    Version: 2021-12-10T23:03:39Z
    Network: 1/1 OK
    Drives: 4/4 OK

    11 GiB Used, 392 Buckets, 8,989 Objects
    4 drives online, 0 drives offline

Testing MinIO with an S3 client

~$ pip3 install awscli
minio$ aws configure --profile minio
AWS Access Key ID [None]: test1234minio
AWS Secret Access Key [None]: test1234minio
Default region name [None]: minio-lan
Default output format [None]: json
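  • aws configure --profile minio simply writes the profile into the standard AWS CLI config files, which can also be edited directly (paths shown are the defaults):

    ```ini
    ; ~/.aws/credentials
    [minio]
    aws_access_key_id = test1234minio
    aws_secret_access_key = test1234minio

    ; ~/.aws/config
    [profile minio]
    region = minio-lan
    output = json
    ```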

The Helm package manager

  • Helmk8s的包管理器,它可以类比为Debian,Ubuntuapt.Red Hatyum,Python中的pip.Nodejsnpm包管理器.Helm可以理解为Kubernetes的包管理工具,可以方便地发现、共享和使用为Kubernetes构建的应用,它包含几个基本概念:
    • .Chart:一个Helm包,其中包含了运行一个应用所需要的镜像、依赖和资源定义等,还可能包含Kubernetes集群中的服务定义,类似Homebrew中的formula,APTdpkg或者Yumrpm文件.
    • .Release:在Kubernetes集群上运行的Chart的一个实例.在同一个集群上,一个Chart可以安装很多次.每次安装都会创建一个新的Release. MySQL Chart,如果想在服务器上运行两个数据库,就可以把这个Chart安装两次.每次安装都会生成自己的Release,会有自己的Release名称.
    • .Repository:用于发布和存储Chart的仓库.
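  • For reference, a Chart is just a directory laid out by convention, roughly:

    ```
    mychart/
    ├── Chart.yaml       # chart name, version, description
    ├── values.yaml      # default configuration values
    ├── charts/          # dependency charts
    └── templates/       # Kubernetes manifests as Go templates
        ├── deployment.yaml
        └── service.yaml
    ```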
# Download the latest release and unpack it into /usr/local/bin
~$ wget -c https://storage.googleapis.com/kubernetes-helm/helm-v2.13.1-linux-amd64.tar.gz

~$ helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Error: could not find tiller

~$ helm completion bash > ~/.helmrc
echo "source ~/.helmrc" >> ~/.bashrc

Installing the Tiller server

  • Using Helm to simplify Kubernetes application deployment
    ~$ helm init
    Creating /home/lcy/.helm
    Creating /home/lcy/.helm/repository
    Creating /home/lcy/.helm/repository/cache
    Creating /home/lcy/.helm/repository/local
    Creating /home/lcy/.helm/plugins
    Creating /home/lcy/.helm/starters
    Creating /home/lcy/.helm/cache/archive
    Creating /home/lcy/.helm/repository/repositories.yaml
    Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
    Error: Looks like "https://kubernetes-charts.storage.googleapis.com" is not a valid chart repository or cannot be reached: read tcp 172.18.127.186:54980->216.58.199.16:443: read: connection reset by peer

  • As shown above, the official server cannot be reached; from mainland China, install using the Aliyun mirror instead:
    ~$ helm init --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.13.1 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
    Creating /home/lcy/.helm/repository/repositories.yaml
    Adding stable repo with URL: https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
    Adding local repo with URL: http://127.0.0.1:8879/charts
    $HELM_HOME has been configured at /home/lcy/.helm.

    Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

    Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
    To prevent this, run `helm init` with the --tiller-tls-verify flag.
    For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
    Happy Helming!

    ~$ helm init --upgrade
    $HELM_HOME has been configured at /home/lcy/.helm.

    Tiller (the Helm server-side component) has been upgraded to the current version.
    Happy Helming!
# Search for charts
    ~$ helm search
    NAME CHART VERSION APP VERSION DESCRIPTION
    stable/acs-engine-autoscaler 2.1.3 2.1.1 Scales worker nodes within agent pools
    stable/aerospike 0.1.7 v3.14.1.2 A Helm chart for Aerospike in Kubernetes
    stable/anchore-engine 0.1.3 0.1.6 Anchore container analysis and policy evaluation engine s...
    stable/artifactory 7.0.3 5.8.4 Universal Repository Manager supporting all major packagi...
    stable/artifactory-ha 0.1.0 5.8.4 Universal Repository Manager supporting all major packagi...
    [...]
# Update the repositories
    ~$ helm repo update
    Hang tight while we grab the latest from your chart repositories...
    ...Skip local chart repository
    ...Successfully got an update from the "stable" chart repository
    Update Complete. ⎈ Happy Helming!⎈

# Repository list
    ~$ helm repo list
    NAME URL
    stable https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
    local http://127.0.0.1:8879/charts

Deploying MinIO with a Helm Chart

  • MinIO manual (Chinese edition)
  • In the default standalone mode, the chart requires the beta APIs to be enabled (Kubernetes 1.4+). If nothing goes wrong the deployment succeeds as shown below; otherwise, see the error handling section later.
    The default accessKey is AKIAIOSFODNN7EXAMPLE.
    The default secretKey is wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY.
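  • Rather than keeping the example credentials, they can be overridden at install time; a sketch assuming the chart exposes accessKey/secretKey values (check helm inspect values stable/minio for the actual names):

    ```yaml
    # custom-values.yaml, passed via: helm install stable/minio -f custom-values.yaml
    accessKey: "myownaccesskey"
    secretKey: "myownsecretkey12345"
    ```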
~$ helm install stable/minio
NAME: snug-elk
LAST DEPLOYED: Mon Apr 15 14:07:10 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
snug-elk-minio-config-cm 2 0s

==> v1/PersistentVolumeClaim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
snug-elk-minio Pending 0s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
snug-elk-minio-7b9878bb66-mmx9n 0/1 Pending 0 0s

==> v1/Secret
NAME TYPE DATA AGE
snug-elk-minio-user Opaque 2 0s

==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
snug-elk-minio-svc LoadBalancer 10.103.48.186 <pending> 9000:32076/TCP 0s

==> v1beta1/Deployment
NAME READY UP-TO-DATE AVAILABLE AGE
snug-elk-minio 0/1 1 0 0s


NOTES:

Minio can be accessed via port 9000 on an external IP address. Get the service external IP address by:
kubectl get svc --namespace default -l app=snug-elk-minio

Note that the public IP may take a couple of minutes to be available.

You can now access Minio server on http://<External-IP>:9000. Follow the below steps to connect to Minio server with mc client:

1. Download the Minio mc client - https://docs.minio.io/docs/minio-client-quickstart-guide

2. mc config host add snug-elk-minio-local http://<External-IP>:9000 AKIAIOSFODNN7EXAMPLE wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY S3v4

3. mc ls snug-elk-minio-local

Alternately, you can use your browser or the Minio SDK to access the server - https://docs.minio.io/categories/17
  • Inspect the Release objects
~$ kubectl get service snug-elk-minio-svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
snug-elk-minio-svc LoadBalancer 10.103.48.186 <pending> 9000:32076/TCP 3m7s

~$ kubectl get pods
NAME READY STATUS RESTARTS AGE
store-minio-75bb89c596-74nz9 0/1 Pending 0 17m

~$ kubectl describe pod store-minio-75bb89c596-74nz9
Name: store-minio-75bb89c596-74nz9
[...]

~$ helm list
NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
snug-elk 1 Mon Apr 15 14:07:10 2019 DEPLOYED minio-0.5.5 default

Installing the MinIO client

  • Following the NOTES printed by the server install above, install and configure its command-line client.
    ~$ wget https://dl.minio.io/client/mc/release/linux-amd64/mc
    ~$ sudo mv mc /usr/local/bin && chmod +x /usr/local/bin/mc
    ~$ kubectl get svc --namespace default -l app=snug-elk-minio
    NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
    snug-elk-minio-svc LoadBalancer 10.103.48.186 <pending> 9000:32076/TCP 25h

Installing the cluster monitoring Dashboard

~$ kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
  • Check whether the installation succeeded.
~$ kubectl  --namespace=kube-system get pod -l k8s-app=kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-5f7b999d65-m2zmt 0/1 ImagePullBackOff 0 10m

~$ kubectl --namespace=kube-system describe pod -l k8s-app=kubernetes-dashboard | grep "Events" -A +10
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned kube-system/kubernetes-dashboard-5f7b999d65-m2zmt to dig001
Normal Pulling 11m (x4 over 14m) kubelet, dig001 Pulling image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1"
Warning Failed 11m (x4 over 13m) kubelet, dig001 Failed to pull image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1": rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 11m (x4 over 13m) kubelet, dig001 Error: ErrImagePull
Warning Failed 11m (x6 over 13m) kubelet, dig001 Error: ImagePullBackOff
Normal BackOff 4m5s (x36 over 13m) kubelet, dig001 Back-off pulling image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1"

  • Delete the failed pod with kubectl -n kube-system delete $(kubectl -n kube-system get pod -o name | grep dashboard), then change the image address below to the domestic Aliyun mirror.
  • Open the deployment with kubectl edit deployment/kubernetes-dashboard -n kube-system, replace k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 with registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1, then save and exit.
# After the change, the installation succeeds.
~$ kubectl --namespace=kube-system get deployment -l k8s-app=kubernetes-dashboard
NAME READY UP-TO-DATE AVAILABLE AGE
kubernetes-dashboard 1/1 1 1 51m
~$ kubectl --namespace=kube-system get pod -l k8s-app=kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-5d9599dc98-gj8w7 1/1 Running 0 99s
~$ kubectl --namespace=kube-system get svc -l k8s-app=kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard ClusterIP 10.107.85.75 <none> 443/TCP 49m

ProxySSH转发访问方式

# On the k8s server
~$ kubectl proxy
Starting to serve on 127.0.0.1:8001
# On the workstation, run the command below, then open http://127.0.0.1:8001/ in a browser to reach the Dashboard
~$ ssh -L 8001:localhost:8001 <user@k8s-server> -Nf

Changing the Service type

  • Open the service with kubectl --namespace=kube-system edit svc kubernetes-dashboard and change type: ClusterIP to type: NodePort.
    ~$ kubectl --namespace=kube-system edit svc kubernetes-dashboard
    ~$ kubectl --namespace=kube-system get svc kubernetes-dashboard
    NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
    kubernetes-dashboard NodePort 10.107.85.75 <none> 443:31690/TCP 65m
  • As shown above, the Dashboard can now be reached at https://<service-host>:31690/.

Access authentication

Creating a service account

~$ cat admin-user.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
~$ kubectl create -f admin-user.yaml

~$ kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}')
Name: admin-user-token-7dj2c
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name: admin-user
kubernetes.io/service-account.uid: a65538aa-673b-11e9-b8a2-00163e027e39

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1025 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLTdkajJjIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhNjU1MzhhYS02NzNiLTExZTktYjhhMi0wMDE2M2UwMjdlMzkiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.eW_YSgn_kTIQDcbB51k8HaY9LABeFg5mFFPJykYgsoyxZH_b80WEcDZn4Z4Ix2BJvhK1sBESfSa_Qn1yN5pcIzUMROIYvBGZBSnMmw2VsSpQMUTJ1ha43ql-GKCz15ro1VrhyeWeCtiVTILA0Z0DwfgO2skjY2x1KO_76sDR7r66frZDjGmgYTm-b3E6RdcETB41Wjjuj-nt3b3ZblkBr3QDKP-tlvnW_nr7LcmgF7etjU8qK_W3fj-LB_BnWRpRiamQeXLNJuC-Dq42x00gAzQuVg17rDcEiKxJWmDYYsojvm7Xg0fSwXLCdBfgysYCz5PMR05dT0QU0iYO7z_Cow
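
  • The token is a standard JWT, so its payload can be inspected locally; a sketch using a fabricated token of the same header.payload.signature shape (real service-account tokens are RS256-signed by the cluster):

    ```shell
    # Build a fake JWT-shaped token; the claims string is purely illustrative.
    claims='{"sub":"system:serviceaccount:kube-system:admin-user"}'
    payload=$(printf '%s' "$claims" | base64 | tr -d '=\n' | tr '/+' '_-')
    token="header.${payload}.signature"

    # Decode the payload: take the 2nd dot-separated field, restore base64
    # padding, map the URL-safe alphabet back, then base64-decode.
    p=$(printf '%s' "$token" | cut -d. -f2)
    case $(( ${#p} % 4 )) in
      2) p="${p}==" ;;
      3) p="${p}="  ;;
    esac
    printf '%s' "$p" | tr '_-' '/+' | base64 -d
    ```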
  • Open the Dashboard login page and paste in the token value above.

Creating a cluster role binding

~$ cat dashboard-admin.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

~$ kubectl create -f dashboard-admin.yaml

~$ kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep "kubernetes-dashboard-token" |awk '{print $1}')
Name: kubernetes-dashboard-token-ftk96
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name: kubernetes-dashboard
kubernetes.io/service-account.uid: ba35db4d-672d-11e9-b8a2-00163e027e39

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1025 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi1mdGs5NiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImJhMzVkYjRkLTY3MmQtMTFlOS1iOGEyLTAwMTYzZTAyN2UzOSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.gnspFBWcou3-EgMsSPPQqSt1fwWne6tLCNgHa0yrQLEH9DDRgDQh1mBDne4Z2M-s3FPlTV9DI77QneanA12jHrpLHRohQSlyiz8Pv3xa7JRb7Hfyj5PhbSlX2KtTbOlVvAdlFttFi3vw-fbUJWcALEmogwa7jnlR233slJLjZ8nAA9xsE-gr4_zYmZ2VhYGfH0dAs2H2aCklRl2Sy5VQpoDlGjKH82-FcCrLwGQyLpAA9tr0H7pivGIFqO46PWR0aBLiT1BBkmjoQJkDPy0qRxi90nG1WyFnDLHYK6BRDTZ4G-J3QhAiAK0su-7i6rJhMKm-FbnYXULIstW1LyO4tg

Updating the Dashboard

  • Installing the Dashboard does not automatically replace an existing older version; the old version must be removed by hand before the new one is installed.
    ~$ kubectl -n kube-system delete $(kubectl -n kube-system get pod -o name | grep dashboard)

Official quick-start examples

The Guestbook example

~$ git clone https://github.com/kubernetes/examples
~$ cd examples
~$ tree -L 1
.
├── cassandra
├── code-of-conduct.md
├── CONTRIBUTING.md
├── guestbook
├── guestbook-go
├── guidelines.md
├── LICENSE
├── mysql-wordpress-pd
├── OWNERS
├── README.md
├── SECURITY_CONTACTS
└── staging

5 directories, 7 files

Installing the Redis Master Pod

~$ cd examples/guestbook && ls *.yaml
frontend-deployment.yaml redis-master-deployment.yaml redis-slave-deployment.yaml
frontend-service.yaml redis-master-service.yaml redis-slave-service.yaml
~$ cat redis-master-deployment.yaml
apiVersion: apps/v1 # for k8s versions before 1.9.0 use apps/v1beta2 and before 1.8.0 use extensions/v1beta1
kind: Deployment
metadata:
  name: redis-master
spec:
  selector:
    matchLabels:
      app: redis
      role: master
      tier: backend
  replicas: 1
  template:
    metadata:
      labels:
        app: redis
        role: master
        tier: backend
    spec:
      containers:
      - name: master
        # image: k8s.gcr.io/redis:e2e # or just image: redis
        image: forestgun007/redis:e2e
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 6379
  • Note that k8s.gcr.io is not reachable from mainland China, so the image in redis-master-deployment.yaml has to be changed; docker search redis:e2e turns up several mirrored copies of the image.
~$ docker search redis:e2e
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
forestgun007/redis gcr.io/google_containers/redis:e2e 1 [OK]
will835559313/gcr_redis gcr.io/google_containers/redis:e2e 0 [OK]
smallguitar/redis-master gcr.io/google_containers/redis:e2e 0 [OK]

~$ kubectl apply -f redis-master-deployment.yaml
~$ kubectl get pods
redis-slave-555b8847c4-mttt9 1/1 Running 0 16h

Installing the Redis Master Service

~$ cat redis-master-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-master
  labels:
    app: redis
    role: master
    tier: backend
spec:
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    app: redis
    role: master
    tier: backend
~$ kubectl apply -f redis-master-service.yaml
~$ kubectl get service redis-master
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-master ClusterIP 10.106.23.86 <none> 6379/TCP 17h

Installing the Redis Slave Pods

$ cat redis-slave-deployment.yaml
apiVersion: apps/v1 # for k8s versions before 1.9.0 use apps/v1beta2 and before 1.8.0 use extensions/v1beta1
kind: Deployment
metadata:
  name: redis-slave
spec:
  selector:
    matchLabels:
      app: redis
      role: slave
      tier: backend
  replicas: 2
  template:
    metadata:
      labels:
        app: redis
        role: slave
        tier: backend
    spec:
      containers:
      - name: slave
        # gcr.io/google_samples/gb-redisslave:v1 must likewise be replaced with this mirror
        image: forestgun007/gb-redisslave:v1
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
          # If your cluster config does not include a dns service, then to
          # instead access an environment variable to find the master
          # service's host, comment out the 'value: dns' line above, and
          # uncomment the line below:
          # value: env
        ports:
        - containerPort: 6379

~$ kubectl apply -f redis-slave-deployment.yaml
~$ kubectl get pods
NAME READY STATUS RESTARTS AGE
redis-slave-555b8847c4-mttt9 1/1 Running 0 17h
redis-slave-555b8847c4-r24xx 1/1 Running 0 17h

Installing the Redis Slave Service

~$ kubectl apply -f  redis-slave-service.yaml

~$ kubectl get svc redis-slave
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-slave ClusterIP 10.103.39.53 <none> 6379/TCP 17h

Installing the Frontend Pods

~$ cat frontend-deployment.yaml
apiVersion: apps/v1 # for k8s versions before 1.9.0 use apps/v1beta2 and before 1.8.0 use extensions/v1beta1
kind: Deployment
metadata:
  name: frontend
spec:
  selector:
    matchLabels:
      app: guestbook
      tier: frontend
  replicas: 3
  template:
    metadata:
      labels:
        app: guestbook
        tier: frontend
    spec:
      containers:
      - name: php-redis
        # image: gcr.io/google-samples/gb-frontend:v4
        image: forestgun007/google-samples-gb-frontend:v4
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
          # If your cluster config does not include a dns service, then to
          # instead access environment variables to find service host
          # info, comment out the 'value: dns' line above, and uncomment the
          # line below:
          # value: env
        ports:
        - containerPort: 80

~$ kubectl apply -f frontend-deployment.yaml

~$ kubectl get pod
NAME READY STATUS RESTARTS AGE
frontend-6f4cc58c94-2wv5l 1/1 Running 0 17h
frontend-6f4cc58c94-s6s8l 1/1 Running 0 17h
frontend-6f4cc58c94-z9qmk 1/1 Running 0 17h

Installing the Frontend Service

~$ kubectl apply -f  frontend-service.yaml
~$ kubectl get service frontend
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
frontend NodePort 10.106.20.24 <none> 80:30577/TCP 17h

Troubleshooting

Certificate error when connecting

~$ kubectl get node
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
  • This error usually occurs after a kubeadm reset when ~/.kube/config has not been refreshed; cp /etc/kubernetes/admin.conf ~/.kube/config fixes it.

Errors joining a node

~$ sudo kubeadm join 172.18.127.186:6443 --token z8r97j.3ovdfddb6df9lnq7     --discovery-token-ca-cert-hash sha256:07767a67fa6c38feda7471ee5e1a15a0a9c417cfdf6cf457ff577297f22d9415
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: couldn't validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s
  • If joining a node takes a very long time and then fails, add -v=6 and errors like the following appear:
    ~$ sudo kubeadm join 172.18.127.186:6443 --token z8r97j.3ovdfddb6df9lnq7     --discovery-token-ca-cert-hash sha256:07767a67fa6c38feda7471ee5e1a15a0a9c417cfdf6cf457ff577297f22d9415 -v=6

    [...]
    I0414 16:23:00.398178 13202 token.go:200] [discovery] Trying to connect to API Server "172.18.127.186:6443"
    I0414 16:23:00.398724 13202 token.go:75] [discovery] Created cluster-info discovery client, requesting info from "https://172.18.127.186:6443"
    I0414 16:23:00.402234 13202 round_trippers.go:438] GET https://172.18.127.186:6443/api/v1/namespaces/kube-public/configmaps/cluster-info 200 OK in 3 milliseconds
    I0414 16:23:00.402426 13202 token.go:203] [discovery] Failed to connect to API Server "172.18.127.186:6443": token id "z8r97j" is invalid for this cluster or it has expired. Use "kubeadm token create" on the control-plane node to create a new valid token
    [...]
  • When that happens, run the command below on the Master node and re-run the kubeadm join line it prints on the node you want to add.
    ~$ kubeadm  token create --print-join-command
# Re-join using the output below.
    kubeadm join 172.18.127.186:6443 --token in1l6v.ue78pr5vvr55qcad --discovery-token-ca-cert-hash sha256:a1f80db7a76e214dd529fc2aed660d71428994d9104c1b320bf5abb6cda4b165
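
  • The --discovery-token-ca-cert-hash is just the SHA-256 of the cluster CA's DER-encoded public key; a sketch that reproduces the computation against a throwaway self-signed CA standing in for /etc/kubernetes/pki/ca.crt (paths are local to this demo):

    ```shell
    # Generate a throwaway CA certificate for demonstration purposes.
    openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/ca.key \
        -out /tmp/ca.crt -subj "/CN=kubernetes" -days 1 2>/dev/null
    # Extract the public key, convert to DER, and hash it -- the same
    # computation kubeadm performs for the join hash.
    hash=$(openssl x509 -pubkey -noout -in /tmp/ca.crt \
           | openssl pkey -pubin -outform der 2>/dev/null \
           | openssl dgst -sha256 -hex | awk '{print $NF}')
    echo "sha256:${hash}"
    ```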

Errors installing charts

~$ helm install stable/minio
Error: could not find a ready tiller pod
  • First, update the repositories
    ~$ helm init --upgrade
    $HELM_HOME has been configured at /home/lcy/.helm.

    Tiller (the Helm server-side component) has been upgraded to the current version.
    Happy Helming!

    ~$ helm repo list
    NAME URL
    stable https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
    local http://127.0.0.1:8879/charts
  • Check the status of the k8s system pods, looking at the tiller-deploy entry.
    ~$ kubectl  -n kube-system get po
    NAME READY STATUS RESTARTS AGE
    calico-kube-controllers-5cbcccc885-krbzj 1/1 Running 0 17h
[...]
    kube-scheduler-k8s-master 1/1 Running 0 18h
    tiller-deploy-c48485567-m7kj2 0/1 ErrImagePull 0 2m50s
  • The output above shows that tiller-deploy failed to pull its image and never started; inspect the details:
~$ kubectl  describe pod tiller-deploy-c48485567-m7kj2   -n kube-system
Name: tiller-deploy-c48485567-m7kj2
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 16m default-scheduler Successfully assigned kube-system/tiller-deploy-c48485567-m7kj2 to dig001
Normal Pulling 14m (x4 over 16m) kubelet, dig001 Pulling image "gcr.io/kubernetes-helm/tiller:v2.13.1"
Warning Failed 13m (x4 over 16m) kubelet, dig001 Failed to pull image "gcr.io/kubernetes-helm/tiller:v2.13.1": rpc error: code = Unknown desc = Error response from daemon: Get https://gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 13m (x4 over 16m) kubelet, dig001 Error: ErrImagePull
Warning Failed 13m (x7 over 16m) kubelet, dig001 Error: ImagePullBackOff
Normal BackOff 82s (x57 over 16m) kubelet, dig001 Back-off pulling image "gcr.io/kubernetes-helm/tiller:v2.13.1"

  • The pull fails because the image lives on gcr.io; pick a mirror with docker search tiller | grep "Mirror" and then edit the deployment with the command below.
    ~$ kubectl edit deploy tiller-deploy -n kube-system
    [....]
  • Replace image: gcr.io/kubernetes-helm/tiller:v2.13.1 above with image: sapcc/tiller:v2.13.1; running again then shows tiller up and running.
~$ kubectl get pod -n kube-system | grep "tiller"
tiller-deploy-b7bd9495c-bf777 1/1 Running 0 2m57s
~$ helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}

Error: no available release name found

~$ kubectl create serviceaccount --namespace kube-system tiller
~$ kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
clusterrolebinding.rbac.authorization.k8s.io/tiller-cluster-rule created
~$ kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
deployment.extensions/tiller-deploy patched

kubeadm init errors

~$ sudo kubeadm init --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers  --kubernetes-version v1.14.1 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.14.1
[preflight] Running pre-flight checks
[preflight] WARNING: Couldn't create the interface used for talking to the container runtime: docker is required for container runtime: exec: "docker": executable file not found in $PATH
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
  • ERROR NumCPU: add the flag --ignore-preflight-errors=NumCPU and it is downgraded to a warning.

  • The FileContent--proc-sys-net-bridge-bridge-nf-call-iptables errors are handled as follows:

    ~$ apt-get install bridge-utils
# a reboot may be required
    ~$ modprobe bridge
    ~$ modprobe br_netfilter

    ~$ cat <<EOF > /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward=1
    EOF

    ~$ sysctl --system
  • kube-proxy and iptables issues

    ~$ kubectl -n kube-system logs  kube-proxy-xxx
    W0514 00:21:27.445425 1 server_others.go:267] Flag proxy-mode="" unknown, assuming iptables proxy
cat mysql-pass.yaml

apiVersion: v1
kind: Secret
metadata:
  name: mysql-pass
type: Opaque
data:
  username: cm9vdA==
  password: cGFzczEyMw==
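
  • The data values in a Secret are only base64-encoded, not encrypted; the two strings above can be reproduced (and audited) from the shell:

    ```shell
    # base64-encode the credentials; printf avoids a trailing newline creeping in
    user=$(printf '%s' root | base64)       # cm9vdA==
    pass=$(printf '%s' pass123 | base64)    # cGFzczEyMw==
    echo "$user $pass"
    # decoding works the same way in reverse
    printf '%s' "$user" | base64 -d         # root
    ```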


Development board overview

Pin definitions

  • The data below is taken from DE0_Nano_User_Manual.pdf, Chapter 3, Section 3.5, Expansion Headers.

GPIO-0 Pin Assignments

Signal Name FPGA Pin No. Description I/O Standard
GPIO_0_IN0 PIN_A8 GPIO Connection DATA 3.3V
GPIO_00 PIN_D3 GPIO Connection DATA 3.3V
GPIO_0_IN1 PIN_B8 GPIO Connection DATA 3.3V
GPIO_01 PIN_C3 GPIO Connection DATA 3.3V
GPIO_02 PIN_A2 GPIO Connection DATA 3.3V
GPIO_03 PIN_A3 GPIO Connection DATA 3.3V
GPIO_04 PIN_B3 GPIO Connection DATA 3.3V
GPIO_05 PIN_B4 GPIO Connection DATA 3.3V
GPIO_06 PIN_A4 GPIO Connection DATA 3.3V
GPIO_07 PIN_B5 GPIO Connection DATA 3.3V
GPIO_08 PIN_A5 GPIO Connection DATA 3.3V
GPIO_09 PIN_D5 GPIO Connection DATA 3.3V
GPIO_010 PIN_B6 GPIO Connection DATA 3.3V
GPIO_011 PIN_A6 GPIO Connection DATA 3.3V
GPIO_012 PIN_B7 GPIO Connection DATA 3.3V
GPIO_013 PIN_D6 GPIO Connection DATA 3.3V
GPIO_014 PIN_A7 GPIO Connection DATA 3.3V
GPIO_015 PIN_C6 GPIO Connection DATA 3.3V
GPIO_016 PIN_C8 GPIO Connection DATA 3.3V
GPIO_017 PIN_E6 GPIO Connection DATA 3.3V
GPIO_018 PIN_E7 GPIO Connection DATA 3.3V
GPIO_019 PIN_D8 GPIO Connection DATA 3.3V
GPIO_020 PIN_E8 GPIO Connection DATA 3.3V
GPIO_021 PIN_F8 GPIO Connection DATA 3.3V
GPIO_022 PIN_F9 GPIO Connection DATA 3.3V
GPIO_023 PIN_E9 GPIO Connection DATA 3.3V
GPIO_024 PIN_C9 GPIO Connection DATA 3.3V
GPIO_025 PIN_D9 GPIO Connection DATA 3.3V
GPIO_026 PIN_E11 GPIO Connection DATA 3.3V
GPIO_027 PIN_E10 GPIO Connection DATA 3.3V
GPIO_028 PIN_C11 GPIO Connection DATA 3.3V
GPIO_029 PIN_B11 GPIO Connection DATA 3.3V
GPIO_030 PIN_A12 GPIO Connection DATA 3.3V
GPIO_031 PIN_D11 GPIO Connection DATA 3.3V
GPIO_032 PIN_D12 GPIO Connection DATA 3.3V
GPIO_033 PIN_B12 GPIO Connection DATA 3.3V
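
In a Quartus project these assignments correspond to set_location_assignment lines in the project's .qsf file; a sketch for the first few pins (the signal names must match whatever the top-level ports are called in your design):

```tcl
set_location_assignment PIN_A8 -to GPIO_0_IN[0]
set_location_assignment PIN_B8 -to GPIO_0_IN[1]
set_location_assignment PIN_D3 -to GPIO_0[0]
set_location_assignment PIN_C3 -to GPIO_0[1]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to GPIO_0[0]
```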

GPIO-1 Pin Assignments

Signal Name FPGA Pin No. Description I/O Standard
GPIO_1_IN0 PIN_T9 GPIO Connection DATA 3.3V
GPIO_10 PIN_F13 GPIO Connection DATA 3.3V
GPIO_1_IN1 PIN_R9 GPIO Connection DATA 3.3V
GPIO_11 PIN_T15 GPIO Connection DATA 3.3V
GPIO_12 PIN_T14 GPIO Connection DATA 3.3V
GPIO_13 PIN_T13 GPIO Connection DATA 3.3V
GPIO_14 PIN_R13 GPIO Connection DATA 3.3V
GPIO_15 PIN_T12 GPIO Connection DATA 3.3V
GPIO_16 PIN_R12 GPIO Connection DATA 3.3V
GPIO_17 PIN_T11 GPIO Connection DATA 3.3V
GPIO_18 PIN_T10 GPIO Connection DATA 3.3V
GPIO_19 PIN_R11 GPIO Connection DATA 3.3V
GPIO_110 PIN_P11 GPIO Connection DATA 3.3V
GPIO_111 PIN_R10 GPIO Connection DATA 3.3V
GPIO_112 PIN_N12 GPIO Connection DATA 3.3V
GPIO_113 PIN_P9 GPIO Connection DATA 3.3V
GPIO_114 PIN_N9 GPIO Connection DATA 3.3V
GPIO_115 PIN_N11 GPIO Connection DATA 3.3V
GPIO_116 PIN_L16 GPIO Connection DATA 3.3V
GPIO_117 PIN_K16 GPIO Connection DATA 3.3V
GPIO_118 PIN_R16 GPIO Connection DATA 3.3V
GPIO_119 PIN_L15 GPIO Connection DATA 3.3V
GPIO_120 PIN_P15 GPIO Connection DATA 3.3V
GPIO_121 PIN_P16 GPIO Connection DATA 3.3V
GPIO_122 PIN_R14 GPIO Connection DATA 3.3V
GPIO_123 PIN_N16 GPIO Connection DATA 3.3V
GPIO_124 PIN_N15 GPIO Connection DATA 3.3V
GPIO_125 PIN_P14 GPIO Connection DATA 3.3V
GPIO_126 PIN_L14 GPIO Connection DATA 3.3V
GPIO_127 PIN_N14 GPIO Connection DATA 3.3V
GPIO_128 PIN_M10 GPIO Connection DATA 3.3V
GPIO_129 PIN_L13 GPIO Connection DATA 3.3V
GPIO_130 PIN_J16 GPIO Connection DATA 3.3V
GPIO_131 PIN_K15 GPIO Connection DATA 3.3V
GPIO_132 PIN_J13 GPIO Connection DATA 3.3V
GPIO_133 PIN_J14 GPIO Connection DATA 3.3V

Table 3-8 Pin Assignments for 2x13 Header

Signal Name FPGA Pin No. Description I/O Standard
GPIO_2[0] PIN_A14 GPIO Connection DATA[0] 3.3V
GPIO_2[1] PIN_B16 GPIO Connection DATA[1] 3.3V
GPIO_2[2] PIN_C14 GPIO Connection DATA[2] 3.3V
GPIO_2[3] PIN_C16 GPIO Connection DATA[3] 3.3V
GPIO_2[4] PIN_C15 GPIO Connection DATA[4] 3.3V
GPIO_2[5] PIN_D16 GPIO Connection DATA[5] 3.3V
GPIO_2[6] PIN_D15 GPIO Connection DATA[6] 3.3V
GPIO_2[7] PIN_D14 GPIO Connection DATA[7] 3.3V
GPIO_2[8] PIN_F15 GPIO Connection DATA[8] 3.3V
GPIO_2[9] PIN_F16 GPIO Connection DATA[9] 3.3V
GPIO_2[10] PIN_F14 GPIO Connection DATA[10] 3.3V
GPIO_2[11] PIN_G16 GPIO Connection DATA[11] 3.3V
GPIO_2[12] PIN_G15 GPIO Connection DATA[12] 3.3V
GPIO_2_IN[0] PIN_E15 GPIO Input 3.3V
GPIO_2_IN[1] PIN_E16 GPIO Input 3.3V
GPIO_2_IN[2] PIN_M16 GPIO Input 3.3V

Table 3-9 Pin Assignments for ADC

Signal Name FPGA Pin No. Description I/O Standard
ADC_CS_N PIN_A10 Chip select 3.3V
ADC_SADDR PIN_B10 Digital data input 3.3V
ADC_SDAT PIN_A9 Digital data output 3.3V
ADC_SCLK PIN_B14 Digital clock input 3.3V

JTAG

de0-nano-jtag-block.png

OpenRISC

The GCC toolchain

~$ git clone https://github.com/stffrdhrn/or1k-toolchain-build
~$ cd or1k-toolchain-build
~$ docker build -t or1k-toolchain-build or1k-toolchain-build/
  • Set the mount-directory variables and run the container to build. If a download URL inside build-gcc.sh has gone stale, replace it with a working alternative, e.g. QEMU_URL=https://github.com/vamanea/qemu-or32/archive/v2.0.2.tar.gz.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
# The location where you have tarballs, so they dont need to be
# downloaded every time you build
CACHEDIR=/home/user/work/docker/volumes/src
# The location where you want your output to go
OUTPUTDIR=/home/user/work/docker/volumes/crosstool

docker run -it --rm \
-e MUSL_ENABLED=1 \
-e NEWLIB_ENABLED=1 \
-e NOLIB_ENABLED=1 \
-e GCC_VERSION=9.0.1 \
-e BINUTILS_VERSION=2.32.51 \
-e LINUX_HEADERS_VERSION=4.19.1 \
-e MUSL_VERSION=1.1.20 \
-e GMP_VERSION=6.1.2 \
-v ${OUTPUTDIR}:/opt/crosstool:Z \
-v ${CACHEDIR}:/opt/crossbuild/cache:Z \
or1k-toolchain-build
  • A successful build leaves the following artifacts:
ls
or1k-elf-9.0.1-20210204.tar.xz
or1k-elf-gcc-9.0.1-20210204.log.gz
or1k-elf-gcc-9.0.1-20210204.sum
or1k-elf-gxx-9.0.1-20210204.log.gz
or1k-elf-gxx-9.0.1-20210204.sum
or1k-linux-9.0.1-20210204.tar.xz
or1k-linux-musl-9.0.1-20210204.tar.xz
or1k-linux-musl-gcc-9.0.1-20210203.log.gz
or1k-linux-musl-gcc-9.0.1-20210203.sum
or1k-linux-musl-gcc-9.0.1-20210204.log.gz
or1k-linux-musl-gcc-9.0.1-20210204.sum
or1k-linux-musl-gxx-9.0.1-20210203.log.gz
or1k-linux-musl-gxx-9.0.1-20210203.sum
or1k-linux-musl-gxx-9.0.1-20210204.log.gz
or1k-linux-musl-gxx-9.0.1-20210204.sum
relnotes-9.0.1-20210204.md

  • Unpack or1k-linux-9.0.1-20210204.tar.xz into a local directory and set the environment variables accordingly:
    export ALTERA_PATH="/home/michael/3TB-DISK/intelFPGA_lite/20.1/"
    export PATH=$PATH:$ALTERA_PATH/quartus/bin

    export ARCH=openrisc
    export CROSS_COMPILE=or1k-linux-
    export PATH=$PATH:`pwd`/toolchain-rootfs/or1k-linux/bin
  • Or build or1k-gcc on its own:
    ~$ git clone https://github.com/openrisc/or1k-gcc
    ~$ cd or1k-gcc/
    ~$ mkdir build-linux
    ~$ cd build-linux && ../configure && make -j4
    ~$ make install DESTDIR=<absolute path>

Building ORPSOC

~$ git clone https://github.com/mczerski/orpsoc-de0_nano
~$ export ALTERA_PATH="/home/fullpath/QuartusIIWebEdition13.0.0.156/quartus"
~$ export PATH=$PATH:$ALTERA_PATH/bin
~$ make OR32_TOOL_PREFIX=or1k-linux- all

Building Linux

~$ tar xvf linux-4.16.14.tar.xz
~$ cd linux-4.16.14
~$ wget -c https://kevinmehall.net/openrisc/guide/de0_nano.dts.txt -O arch/openrisc/boot/dts/de0_nano.dts

~$ make ARCH=openrisc CROSS_COMPILE="or1k-linux-" or1ksim_defconfig
/** Select Processor type and Features -> Builtin DTB and type de0_nano */
~$ make ARCH=openrisc CROSS_COMPILE="or1k-linux-" menuconfig
~$ make ARCH=openrisc CROSS_COMPILE="or1k-linux-"

Programming the bitstream

~$ cd orpsoc/boards/altera/de0_nano/syn/quartus/run
~$ make OR32_TOOL_PREFIX=or1k-linux- all
~$ make pgm

Connecting the serial port

  • Using the GPIO pin definitions above and the information below, connect the correct RX and TX lines.
cat boards/altera/de0_nano/syn/quartus/tcl/UART0_pin_assignments.tcl
set_location_assignment PIN_D8 -to uart0_srx_pad_i
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to uart0_srx_pad_i
set_location_assignment PIN_F8 -to uart0_stx_pad_o
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to uart0_stx_pad_o

Trying FuseSoC

Installing FuseSoC


~$ git clone https://github.com/olofk/fusesoc

~$ cd fusesoc && pip install -e .

OR
~$ pip install fusesoc

Installing FuseSoC libraries

  • Install from the network:
~$ fusesoc library add intgen https://github.com/openrisc/intgen.git
  • Install from a local checkout:
~$ git clone https://github.com/openrisc/mor1kx-generic.git
~$ git clone https://github.com/openrisc/or1k_marocchino.git

~$ fusesoc library add mor1kx-generic `pwd`/mor1kx-generic
~$ fusesoc library add or1k_marocchino `pwd`/or1k_marocchino
~$ fusesoc list-cores
Available cores:

Core Cache status Description
================================================================================
::blinky:0 : local : <No description>
::intgen:0 : local : Interrupt Generator For testing Processors
::mor1kx-generic:1.1 : local : Minimal mor1kx simulation environment
::or1k_marocchino:5.0-r3 : local : <No description>
::plights:0 : local : <No description>
::rv_sopc:0 : local : RISC V system on programmable chip example
::wb_intercon_gen_ng:0 : local : CAPI=2 .core file description based Wishbone Interconnect generator
  • Inspect the fusesoc.conf configuration:
~$ cat fusesoc.conf
[library.mor1kx-generic]
location = /fullpath/FPGA-DE0-Nano/openrisc/mor1kx-generic
sync-uri = /fullpath/FPGA-DE0-Nano/openrisc/mor1kx-generic
sync-type = local
auto-sync = true

[library.or1k_marocchino]
location = /fullpath/FPGA-DE0-Nano/openrisc/or1k_marocchino
sync-uri = /fullpath/FPGA-DE0-Nano/openrisc/or1k_marocchino
sync-type = local
auto-sync = true

[library.intgen]
location = /fullpath/FPGA-DE0-Nano/openrisc/intgen
sync-uri = /fullpath/FPGA-DE0-Nano/openrisc/intgen
sync-type = local
auto-sync = true

[library.fusesoc-demos]
location = fusesoc_libraries/fusesoc-demos
sync-uri = https://github.com/Oxore/fusesoc-demos
sync-type = git
auto-sync = true
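
fusesoc.conf is plain INI syntax, so the registered libraries can be listed programmatically. A minimal Python sketch using the standard library's configparser (it reads an inline excerpt shaped like the file above, rather than the real file):

```python
import configparser

# Minimal fusesoc.conf excerpt in the same shape as the one shown above.
CONF = """
[library.mor1kx-generic]
location = /fullpath/FPGA-DE0-Nano/openrisc/mor1kx-generic
sync-type = local

[library.fusesoc-demos]
location = fusesoc_libraries/fusesoc-demos
sync-uri = https://github.com/Oxore/fusesoc-demos
sync-type = git
"""

parser = configparser.ConfigParser()
parser.read_string(CONF)
for section in parser.sections():
    if section.startswith("library."):
        name = section.split(".", 1)[1]
        # Print each registered library and how it syncs (local vs git).
        print(name, parser[section]["sync-type"])
```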

Fusesoc-demos

fusesoc library add  https://github.com/Oxore/fusesoc-demos

RISC-V

Installing Quartus 20

Test-building litex-boards

  • Here we test with a litex-hub project; Quartus is installed under ~/intelFPGA_lite/20.1/quartus. Note: when passing the --load option, the jtagd service must already be running, otherwise quartus_pgm cannot program over JTAG.
~$ export PATH=~/riscv64-toolchain/bin:~/intelFPGA_lite/20.1/quartus/bin:$PATH
~$ jtagd --foreground --debug
~$ cd litex-boards/litex_boards
~$ targets/terasic_de0nano.py --uart-name=jtag_uart --build --load
  • The build ultimately runs the following script:
linux-on-litex-vexriscv$ cat build/de0nano/gateware/build_de0nano.sh
# Autogenerated by LiteX / git: 55a79030
quartus_map --read_settings_files=on --write_settings_files=off de0nano -c de0nano
quartus_fit --read_settings_files=off --write_settings_files=off de0nano -c de0nano
quartus_asm --read_settings_files=off --write_settings_files=off de0nano -c de0nano
quartus_sta de0nano -c de0nano
if [ -f "de0nano.sof" ]
then
quartus_cpf -c de0nano.sof de0nano.rbf
fi

Building and testing linux-on-litex-vexriscv

~$ cd linux-on-litex-vexriscv
~$ export PATH=~/riscv64-toolchain/bin:~/intelFPGA_lite/20.1/quartus/bin:$PATH
~$ ./make.py --board=de0nano --build --load
  • When running with --load, make sure jtagd is running; as shown below, the bitstream is ultimately loaded with quartus_pgm -m jtag -c USB-Blaster.
    [...]
    Info: Command: quartus_pgm -m jtag -c USB-Blaster -o p;/home/michael/workspace-xilinx/RISC-V/litex-hub/litex/linux-on-litex-vexriscv/build/de0nano/gateware/de0nano.sof@1
    Info (213046): Using programming cable "USB-Blaster on 127.0.0.1 [3-3]"
    Info (213011): Using programming file /home/michael/workspace-xilinx/RISC-V/litex-hub/litex/linux-on-litex-vexriscv/build/de0nano/gateware/de0nano.sof with checksum 0x0085AEAF for device EP4CE22F17@1
    Info (209060): Started Programmer operation at Sat Feb 26 11:35:48 2022
    Info (209016): Configuring device index 1
    Info (209017): Device 1 contains JTAG ID code 0x020F30DD
    Info (209007): Configuration succeeded -- 1 device(s) configured
    Info (209011): Successfully performed operation(s)
    Info (209061): Ended Programmer operation at Sat Feb 26 11:35:49 2022
    Info: Quartus Prime Programmer was successful. 0 errors, 0 warnings
    Info: Peak virtual memory: 315 megabytes
    Info: Processing ended: Sat Feb 26 11:35:49 2022
    Info: Elapsed time: 00:00:32
    Info: Total CPU time (on all processors): 00:00:00

Viewing boot messages over JTAG-UART

  • The script defaults to serial for communication; above I built with --uart-name=jtag_uart, wanting to try UART communication over the board's USB JTAG port.
~$ ./targets/terasic_de0nano.py --help
[...]
--no-uart
Disable UART. (default: False)
--uart-name UART_NAME
UART type/name. (default: serial)
--uart-baudrate UART_BAUDRATE
UART baudrate. (default: 115200)
--uart-fifo-depth UART_FIFO_DEPTH
UART FIFO depth. (default: 16)
[...]
  • Create a new openocd configuration file with settings for the DE0-Nano's EP4CE22F17, as follows:
litex_boards$ cat prog/openocd_de0nano.cfg
adapter driver usb_blaster
usb_blaster lowlevel_driver ftdi
set _CHIPNAME EP4CE22F17
set FPGA_TAPID 0x020F30DD
adapter speed 6000
jtag newtap $_CHIPNAME tap -irlen 10 -expected-id $FPGA_TAPID
#scan_chain
gdb_port disabled
tcl_port disabled

  • Connect as follows:
 ~/.local/bin/litex_term jtag --jtag-config=./prog/openocd_de0nano.cfg
port is 20000
got ir value 2
Open On-Chip Debugger 0.11.0+dev-00562-g5ab74bde0-dirty (2022-02-07-19:44)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : only one transport option; autoselect 'jtag'
jtagstream_serve
Info : usb blaster interface using libftdi
Info : This adapter doesn't support configurable speed
Info : JTAG tap: EP4CE22F17.tap tap/device found: 0x020f30dd (mfg: 0x06e (Altera), part: 0x20f3, ver: 0x0)
Warn : gdb services need one or more targets defined

  • Connecting to the jtag-uart simply hangs and no serial console appears. litex_term's jtag option wraps the openocd command line, equivalent to:
~$ openocd -f ./prog/openocd_de0nano.cfg -f stream.cfg -c <....>
  • After --build completes, a stream.cfg file is generated in the current directory; it is an openocd configuration written in TCL, auto-generated by litex/litex/build/openocd.py, excerpted below:
litex$ tail -n 20 litex/litex/build/openocd.py
}

proc jtagstream_serve {tap port} {
set sock [socket stream.server $port]
$sock readable [list jtagstream_client $tap $sock]
stdin readable [list jtagstream_exit $sock]
vwait forever
$sock close
}
"""
write_to_file("stream.cfg", cfg)
print("port is {:d}".format(port))
print("got ir value {:d}".format(self.get_ir(chain,config)))
script = "; ".join([
"init",
"irscan $_CHIPNAME.tap {:d}".format(self.get_ir(chain, config)),
"jtagstream_serve $_CHIPNAME.tap {:d}".format(port),
"exit",
])
self.call(["openocd", "-f", config, "-f", "stream.cfg", "-c", script])
  • Connecting to jtag_uart through the board's USB port did not work.

Viewing boot messages over the serial port

  • Since the jtag_uart approach failed, fall back to the default serial connection. The board has no USB-to-UART function, and the documentation does not explain how to reach its UART, but searching the sources turns up the following definition:
litex-boards$ cat litex_boards/platforms/terasic_de0nano.py
[...]
# Switches
("sw", 0, Pins("M1"), IOStandard("3.3-V LVTTL")),
("sw", 1, Pins("T8"), IOStandard("3.3-V LVTTL")),
("sw", 2, Pins("B9"), IOStandard("3.3-V LVTTL")),
("sw", 3, Pins("M15"), IOStandard("3.3-V LVTTL")),

# Serial
("serial", 0,
# Compatible with cheap FT232 based cables (ex: Gaoominy 6Pin Ftdi Ft232Rl Ft232)
# GND on JP1 Pin 12.
Subsignal("tx", Pins("JP1:10"), IOStandard("3.3-V LVTTL")),
Subsignal("rx", Pins("JP1:8"), IOStandard("3.3-V LVTTL"))
),

# SDR SDRAM
("sdram_clock", 0, Pins("R4"), IOStandard("3.3-V LVTTL")),
[...]
  • Here an FTDI 2232H is wired up as follows:
DE0-nano JP1                FT2232H
GPIO_05 Pin8 <-------> AD0 TXD
GPIO_07 Pin10 <-------> AD1 RXD
GND Pin12 <-------> GND
  • At this point litex_term or minicom can talk to the board's serial port. Garbled output means a baudrate mismatch: the default here is actually 1 Mbps (1e6). Many cheap USB-to-UART adapters bought online garble output or drop input at 1 Mbps; switching to an FTDI 2232H made it work normally.
litex-boards $ minicom -o -b 1000000 -D /dev/ttyUSB0

__ _ __ _ __
/ / (_) /____ | |/_/
/ /__/ / __/ -_)> <
/____/_/\__/\__/_/|_|
Build your hardware, easily!

(c) Copyright 2012-2022 Enjoy-Digital
(c) Copyright 2007-2015 M-Labs

BIOS CRC passed (1f65f3e6)

Migen git sha1: ac70301
LiteX git sha1: 7cc781f7

--=============== SoC ==================--
CPU: VexRiscv SMP-LINUX @ 50MHz
BUS: WISHBONE 32-bit @ 4GiB
CSR: 32-bit data
ROM: 64KiB
SRAM: 8KiB
L2: 2KiB
SDRAM: 32768KiB 16-bit @ 50MT/s (CL-2 CWL-2)

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
Write: 0x40000000-0x40200000 2.0MiB
Read: 0x40000000-0x40200000 2.0MiB
Memtest OK
Memspeed at 0x40000000 (Sequential, 2.0MiB)...
Write speed: 15.8MiB/s
Read speed: 11.7MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Timeout
No boot medium found

--============= Console ================--

litex>

  • Use help to list the supported commands:
litex> help
LiteX BIOS, available commands:

help - Print this help
ident - Identifier of the system
crc - Compute CRC32 of a part of the address space
flush_cpu_dcache - Flush CPU data cache
flush_l2_cache - Flush L2 cache
leds - Set Leds value

boot - Boot from Memory
reboot - Reboot
serialboot - Boot from Serial (SFL)

mem_list - List available memory regions
mem_read - Read address space
mem_write - Write address space
mem_copy - Copy address space
mem_test - Test memory access
mem_speed - Test memory speed
mem_cmp - Compare memory content

sdram_test - Test SDRAM


litex> ident

Ident: LiteX SoC on DE0-Nano

  • The leds command switches the board's led0~led7 on and off:
litex> leds 255  # all LEDs on

Settings Leds to 0xff
litex> leds 1 # led0 on

Settings Leds to 0x1

litex> leds 11 # led0, led1, led3 on: (1 << 0) + (1 << 1) + (1 << 3)

Settings Leds to 0xb
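
The argument to leds is just a bitmask, with bit n driving led n. A small Python sketch of the arithmetic (leds_mask is a hypothetical helper, not part of the LiteX BIOS):

```python
def leds_mask(*leds):
    """Build the value for the BIOS `leds` command from LED indices."""
    mask = 0
    for n in leds:
        mask |= 1 << n          # set bit n to light led n
    return mask

print(hex(leds_mask(*range(8))))  # all eight LEDs -> 0xff
print(hex(leds_mask(0)))          # led0 only      -> 0x1
print(hex(leds_mask(0, 1, 3)))    # led0,1,3       -> 0xb
```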

Adding an SPI-SDCard peripheral

  • The DE0-Nano has no SD card slot, so wire an SPI-SDCard socket to the JP1 header and let the board boot Linux from the card. This requires source changes: first, add spisdcard to the soc_capabilities parameter of the De0Nano class in linux-on-litex-vexriscv/make.py, as follows:
litex-hub/linux-on-litex-vexriscv$ cat make.py
[...]
class De0Nano(Board):
soc_kwargs = {"l2_size" : 2048} # Use Wishbone and L2 for memory accesses.
def __init__(self):
from litex_boards.targets import de0nano
Board.__init__(self, de0nano.BaseSoC, soc_capabilities={
# Communication
"serial",
"spisdcard"
}, bitstream_ext=".sof")
[...]
  • Then, in the litex-hub/litex-boards project, apply the following patch:
litex-hub/litex-boards$ git diff litex_boards/platforms/terasic_de0nano.py
diff --git a/litex_boards/platforms/terasic_de0nano.py b/litex_boards/platforms/terasic_de0nano.py
index 3284ddc..7a810f7 100644
--- a/litex_boards/platforms/terasic_de0nano.py
+++ b/litex_boards/platforms/terasic_de0nano.py
@@ -115,6 +115,14 @@ _io = [
"F15 F16 F14 G16 G15"),
IOStandard("3.3-V LVTTL")
),
+ # SDCard
+ ("spisdcard", 0,
+ Subsignal("clk", Pins("JP1:18")),
+ Subsignal("cs_n", Pins("JP1:20")),
+ Subsignal("mosi", Pins("JP1:14"), Misc("WEAK_PULL_UP_RESISTOR ON")),
+ Subsignal("miso", Pins("JP1:16"), Misc("WEAK_PULL_UP_RESISTOR ON")),
+ IOStandard("3.3-V LVTTL")
+ ),
]

# Connectors ---------------------------------------------------------------------------------------
  • Build linux-on-litex-vexriscv:
    linux-on-litex-vexriscv$ ./make.py --board=de0nano --build
  • When that finishes, take an SD card, use fdisk to set the partition type to W95 FAT32, and format it with mkfs.fat. Then copy the files from linux-on-litex-vexriscv/images into the card's root directory. The SPI-SDCard module wires to JP1 as follows:
SPI-Card Module              de0-nano JP1  FPGA Pin No
CS <----------> JP1:20 GPIO_015 C6
CLK <----------> JP1:18 GPIO_013 D6
SDO <----------> JP1:16 GPIO_011 A6
SDI <----------> JP1:14 GPIO_09 D5
GND <----------> GND
3V3 <----------> 3V3
  • The wiring above follows the code in litex_boards/platforms/terasic_de0nano.py together with the GPIO-0 Pin Assignments described earlier.
linux-on-litex-vexriscv$ ./make.py --board=de0nano --load
  • Connect the serial port and boot from the SDCard:

    linux-on-litex-vexriscv$ minicom -o -b 1000000 -D /dev/ttyUSB0

    (c) Copyright 2012-2022 Enjoy-Digital
    (c) Copyright 2007-2015 M-Labs

    BIOS CRC passed (1038d38c)

    Migen git sha1: ac70301
    LiteX git sha1: 7f49c523

    --=============== SoC ==================--
    CPU: VexRiscv SMP-LINUX @ 50MHz
    BUS: WISHBONE 32-bit @ 4GiB
    CSR: 32-bit data
    ROM: 64KiB
    SRAM: 8KiB
    L2: 2KiB
    SDRAM: 32768KiB 16-bit @ 50MT/s (CL-2 CWL-2)

    --========== Initialization ============--
    Initializing SDRAM @0x40000000...
    Switching SDRAM to software control.
    Switching SDRAM to hardware control.
    Memtest at 0x40000000 (2.0MiB)...
    Write: 0x40000000-0x40200000 2.0MiB
    Read: 0x40000000-0x40200000 2.0MiB
    Memtest OK
    Memspeed at 0x40000000 (Sequential, 2.0MiB)...
    Write speed: 16.7MiB/s
    Read speed: 20.3MiB/s

    --============== Boot ==================--
    Booting from serial...
    Press Q or ESC to abort boot completely.
    sL5DdSMmkekro
    Timeout
    Booting from SDCard in SPI-Mode...
    Booting from boot.json...
    Copying Image to 0x40000000 (7531468 bytes)...
    [########################################]
    Copying rv32.dtb to 0x40ef0000 (2621 bytes)...
    [########################################]
    Copying rootfs.cpio to 0x41000000 (3786240 bytes)...
    [########################################]
    Copying opensbi.bin to 0x40f00000 (53640 bytes)...
    [########################################]
    Executing booted program at 0x40f00000
    [...]
  • If booting from the SDCard fails, first make sure the card's partition type is W95 FAT32, then try another card. Here an old 512MB card and an old 1GB card were both undetectable, while a 32GB and a 128GB card both loaded and ran fine. It seems enjoy-digital/litesdcard has compatibility problems with old cards, or there is some other unknown cause.

The quartus_cpf command

  • View the help for its options, e.g.:
~$ quartus_cpf --help=rpd

Topic: rpd

To generate a Raw Programming Data File (.rpd), specify the input
file name and output file name. Make sure the file extension
of the output file is .rpd. The input file can be only a
Programmer Object File (.pof).

---------
Examples:
---------

# To convert .pof to .rpd
quartus_cpf -c <input_pof_file> <output_rpd_file>

# To use a Conversion Setup File (.cof) created with
# the Convert Programming Files dialog box in the UI
quartus_cpf -c <input_cof_file>

  • Inspect a .sof file:
~$ quartus_cpf --info de0nano.sof
File: de0nano.sof
File CRC: 0x24F3
Creator: Quartus Prime Compiler Version 20.1.1 Build 720 11/11/2020 SJ Lite Edition
Comment: Untitled
Device: EP4CE22F17
Data checksum: 0x008595BA
JTAG usercode: 0x008595BA
Project Hash: 0x

  • Generate an .svf file:
~$ quartus_cpf -c -q 6.0MHz -g 3.3 -n p de0nano.sof de0nano.svf
  • Generate an .rpd file:
~$ quartus_cpf -c -d EPCS64 de0nano.sof de0nano.pof
~$ quartus_cpf -c -d EPCS64 -s EP4CE22F17 de0nano.pof de0nano.rpd
  • Generate a .jic file, which can be programmed via Quartus Prime IDE -> Tools -> Programmer -> Add File...; make sure the jtagd service is running.
    ~$ quartus_cpf -c -d EPCS64 -s EP4CE22F17 de0nano.sof de0nano.jic

Loading an SVF file with openocd

  • Create an openocd connection configuration matching the board's parameters:

    ~$ cat > openocd_de0nano.cfg <<EOF
    adapter driver usb_blaster
    usb_blaster lowlevel_driver ftdi
    set CHIPNAME EP4CE22F17
    set FPGA_TAPID 0x020F30DD # obtained via jtagconfig
    adapter speed 6000
    jtag newtap $CHIPNAME tap -irlen 10 -expected-id $FPGA_TAPID
    init
    scan_chain

    EOF
  • Program it into the FPGA:

~$ openocd -f ./openocd_de0nano.cfg -c "svf  de0nano.svf progress" -c exit
Open On-Chip Debugger 0.11.0+dev-00562-g5ab74bde0-dirty (2022-02-07-19:44)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : only one transport option; autoselect 'jtag'
Info : usb blaster interface using libftdi
Info : This adapter doesn't support configurable speed
Info : JTAG tap: EP4CE22F17.tap tap/device found: 0x020f30dd (mfg: 0x06e (Altera), part: 0x20f3, ver: 0x0)
Warn : gdb services need one or more targets defined
TapName Enabled IdCode Expected IrLen IrCap IrMask
-- ------------------- -------- ---------- ---------- ----- ----- ------
0 EP4CE22F17.tap Y 0x020f30dd 0x020f30dd 10 0x01 0x03

svf processing file: "de0nano.svf"
0% FREQUENCY 1.20E+07 HZ;
Error: Translation from khz to adapter speed not implemented

0% TRST ABSENT;
0% ENDDR IDLE;
0% ENDIR IRPAUSE;
0% STATE IDLE;
0% SIR 10 TDI (002);
0% RUNTEST IDLE 12000 TCK ENDSTATE IDLE;
95% FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF);
95% SIR 10 TDI (004);
95% RUNTEST 60 TCK;
95% 000000000000000000000000000000000000000000000000000000000000000000);
95% SIR 10 TDI (003);
95% RUNTEST 49152 TCK;
95% RUNTEST 512 TCK;
95% SIR 10 TDI (3FF);
95% RUNTEST 12000 TCK;
95% STATE IDLE;

Time used: 0m1s439ms
svf file programmed successfully for 17 commands with 0 errors

Programming the SPI flash

  • openFPGALoader Intel/Altera
  • Here openFPGALoader -b de0nano -f de0nano.rpd fails with the error flash stackflow; the command below programs the flash successfully instead.
~$ openFPGALoader -c usb-blaster --fpga-part ep4ce2217 -f  de0nano.rbf

Using UrJTAG

  • The version from apt-get install urjtag is too old and does not support the ep4ce22, reporting:
~$ jtag
jtag> cable UsbBlaster vid=0x09fb pid=0x6001 interface=0
Connected to libftdi driver.
jtag> detect
IR length: 10
Chain length: 1
Device Id: 00000010000011110011000011011101 (0x020F30DD)
Manufacturer: Altera (0x0DD)
Unknown part! (0010000011110011) (/usr/share/urjtag/altera/PARTS)

  • Following the guide referenced here, build from the latest (urjtag-2021.03) sources; this also requires downloading the D2XX Drivers from FTDI's website.

  • Download the D2XX Drivers:

~$ wget -c https://ftdichip.com/wp-content/uploads/2021/09/libftd2xx-x86_64-1.4.24.tgz
~$ tar xvf libftd2xx-x86_64-1.4.24.tgz
release/
release/release-notes.txt
release/WinTypes.h
[...]
  • Build and install urjtag-2021.03:
~$ wget -c https://sourceforge.net/projects/urjtag/files/urjtag/2021.03/urjtag-2021.03.tar.xz/download
~$ tar xvf urjtag-2021.03.tar.xz
~$ cd urjtag-2021.03
~$ CFLAGS=-I$PWD/../release LDFLAGS="-L$PWD/../release/build -lftd2xx" ./configure --with-libusb --with-libftdi --with-ftd2xx
[...]
Libraries:
libusb : 1.0
libftdi : yes (have async mode)
libftd2xx : yes
inpout32 : no

Subsystems:
SVF : yes
BSDL : yes
STAPL : no

Drivers:
Bus : ahbjtag arm9tdmi au1500 avr32 bcm1250 blackfin bscoach ejtag ejtag_dma fjmem ixp425 ixp435 ixp465 jopcyc h7202 lh7a400 mpc5200 mpc824x mpc8313 mpc837x ppc405ep ppc440gx_ebc8 prototype pxa2x0 pxa27x s3c4510 sa1110 sh7727 sh7750r sh7751r sharc_21065L sharc_21369_ezkit slsup3 tx4925 zefant_xs3
Cable : arcom byteblaster dirtyjtag dlc5 ea253 ei012 ft2232 gpio ice100 igloo jlink keithkoep lattice mpcbdm triton usbblaster vsllink wiggler xpc
Lowlevel : direct ftdi ftd2xx ppdev

Language bindings:
python : yes
~$ make && make install

  • Apply the patch adding the ep4ce22 description files:
~$ cd /usr/local/share/urjtag$
~$ sudo patch -p1 < ~/urjtag-descriptors.patch
patching file altera/ep4ce22/ep4ce22
patching file altera/ep4ce22/STEPPINGS
patching file altera/PARTS
Hunk #1 succeeded at 28 (offset 2 lines).
michael@debian:/usr/local/share/urjtag$ /usr/local/bin/jtag
  • The patch file:
~$ cat urjtag-descriptors.patch
diff -Naur urjtag-orig/altera/ep4ce22/ep4ce22 urjtag/altera/ep4ce22/ep4ce22
--- urjtag-orig/altera/ep4ce22/ep4ce22 1970-01-01 10:00:00.000000000 +1000
+++ urjtag/altera/ep4ce22/ep4ce22 2014-07-30 21:48:09.652857260 +1000
@@ -0,0 +1,12 @@
+instruction length 10
+register DIR 32
+register USERCODE 32
+register BSR 732
+register BYPASS 1
+instruction HIGHZ 0000001011 BYPASS
+instruction CLAMP 0000001010 BYPASS
+instruction USERCODE 0000000111 USERCODE
+instruction IDCODE 0000000110 DIR
+instruction SAMPLE/PRELOAD 0000000101 BSR
+instruction EXTEST 0000001111 BSR
+instruction BYPASS 1111111111 BYPASS
diff -Naur urjtag-orig/altera/ep4ce22/STEPPINGS urjtag/altera/ep4ce22/STEPPINGS
--- urjtag-orig/altera/ep4ce22/STEPPINGS 1970-01-01 10:00:00.000000000 +1000
+++ urjtag/altera/ep4ce22/STEPPINGS 2014-07-30 21:48:09.644857260 +1000
@@ -0,0 +1,23 @@
+#
+# $Id: STEPPINGS 897 2007-12-29 13:02:32Z arniml $
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+# 02111-1307, USA.
+#
+# Written by H Hartley Sweeten <hsweeten@visionengravers.com>
+#
+
+# bits 31-28 of the Device Identification Register
+0000 ep4ce22 0
diff -Naur urjtag-orig/altera/PARTS urjtag/altera/PARTS
--- urjtag-orig/altera/PARTS 2014-07-28 22:19:56.968449502 +1000
+++ urjtag/altera/PARTS 2014-07-30 21:48:08.464857263 +1000
@@ -26,3 +26,4 @@
0111000100101000 epm7128aetc100 EPM7128AETC100
0111000001100100 epm3064a EPM3064A
0010000010110010 ep2c8 EP2C8
+0010000011110011 ep4ce22 EP4CE22

  • Run the new UrJTAG:
~$ /usr/local/bin/jtag

UrJTAG 2021.03 #
Copyright (C) 2002, 2003 ETC s.r.o.
Copyright (C) 2007, 2008, 2009 Kolja Waschk and the respective authors

UrJTAG is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
There is absolutely no warranty for UrJTAG.

warning: UrJTAG may damage your hardware!
Type "quit" to exit, "help" for help.

jtag> cable UsbBlaster vid=0x09fb pid=0x6001 interface=0
Connected to libftdi driver.
jtag> detect
IR length: 10
Chain length: 1
Device Id: 00000010000011110011000011011101 (0x020F30DD)
Manufacturer: Altera (0x0DD)
Part(0): EP4CE22 (0x20F3)
Stepping: 0
Filename: /usr/local/share/urjtag/altera/ep4ce22/ep4ce22
jtag> cable usbblaster driver=ftdi
Connected to libftdi driver.
jtag> detect
IR length: 10
Chain length: 1
Device Id: 00000010000011110011000011011101 (0x020F30DD)
Manufacturer: Altera (0x0DD)
Part(0): EP4CE22 (0x20F3)
Stepping: 0
Filename: /usr/local/share/urjtag/altera/ep4ce22/ep4ce22
jtag> print chain
No. Manufacturer Part Stepping Instruction Register
-------------------------------------------------------------------------------------------------------------------
* 0 Altera EP4CE22 0 BYPASS BYPASS
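
The 32-bit Device Id printed by both openocd and UrJTAG decodes per the JTAG IDCODE layout: bits 31-28 are the version/stepping, bits 27-12 the part number, bits 11-1 the JEDEC manufacturer id, and bit 0 is always 1. A quick Python check against the 0x020F30DD value seen above:

```python
def decode_idcode(idcode):
    """Split a 32-bit JTAG IDCODE into (version, part, manufacturer)."""
    version = (idcode >> 28) & 0xF        # stepping / revision
    part = (idcode >> 12) & 0xFFFF        # part number field
    manufacturer = (idcode >> 1) & 0x7FF  # JEDEC manufacturer id
    return version, part, manufacturer

ver, part, mfg = decode_idcode(0x020F30DD)
print(ver, hex(part), hex(mfg))  # matches the openocd report: ver 0x0, part 0x20f3, mfg 0x06e
```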

Other topics

USB Blaster connection issues

  • Altera Design Software

  • Since only the quartus_pgm command is needed here, only QuartusProgrammerSetup-16.1.0.196-linux.run was downloaded.

~$ jtagd --foreground --debug

~$ ./jtagd --user-start --foreground
~$ ./jtagconfig
Error (Server error) when scanning hardware

  • Check the system log:
~$ dmesg
[...]
[25811.819181] usb 4-2: USB disconnect, device number 16
[25814.375520] usb 4-2: new full-speed USB device number 17 using xhci_hcd
[25814.550270] usb 4-2: New USB device found, idVendor=09fb, idProduct=6001, bcdDevice= 4.00
[25814.550283] usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[25814.550289] usb 4-2: Product: USB-Blaster
[25814.550293] usb 4-2: Manufacturer: Altera
[25814.550297] usb 4-2: SerialNumber: 91d28408

  • Running jtagconfig directly fails with the error below.

    ~$ ./jtagconfig
    Error when scanning hardware - Server error
  • Then run it under strace, filtering for network calls:

~$ strace -e trace=network jtagconfig
[...]
si_stime=0} ---
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(3, SOL_SOCKET, SO_LINGER, {l_onoff=1, l_linger=10}, 8) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(1309), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(4, SOL_SOCKET, SO_LINGER, {l_onoff=1, l_linger=10}, 8) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(1309), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
recvfrom(3, "", 2, 0, NULL, NULL) = 0
getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
recvfrom(4, "", 2, 0, NULL, NULL) = 0
recvfrom(-1, 0x1e4ca0c, 2, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)

  • To rule out a hardware problem, with no second machine available, the same OS version was installed in VirtualBox; inside the VM the device is detected without any extra setup:
    ~$ ./jtagconfig
    1) USB-Blaster [2-2]
    Unable to read device chain - JTAG chain broken

  • Next, following the official documentation, add the udev rules and start jtagd in debug mode:
~# cat>/etc/udev/rules.d/51-altera-usb-blaster.rules<<EOF
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6001", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6002", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6003", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6010", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6810", MODE="0666"
EOF

~$ jtagd --foreground --debug
JTAG daemon started
Using config file /etc/jtagd/jtagd.conf
Remote JTAG permitted when password set
USB-Blaster "USB-Blaster" firmware version 4.00
USB-Blaster endpoints out=02(64), in=81(64); urb size=1024
USB-Blaster added "USB-Blaster [4-2]"
USB-Blaster port (/dev/bus/usb/004/017) opened

  • Running jtagconfig directly still reports Error when scanning hardware - Server error; applying the settings below, per the documentation above, fixes it.

jtagd server configuration

~# cp /fullpath/intelFPGA_lite/16.1/qprogrammer/linux64/pgm_parts.txt /etc/jtagd/jtagd.pgm_parts
~# echo "Password = \"123456\";" > /etc/jtagd/jtagd.conf
~# killall -9 jtagd
~$ jtagd --foreground --debug
JTAG daemon started
Using config file /etc/jtagd/jtagd.conf
Remote JTAG permitted when password set
USB-Blaster "USB-Blaster" firmware version 4.00
USB-Blaster endpoints out=02(64), in=81(64); urb size=1024
USB-Blaster added "USB-Blaster [6-2]"
USB-Blaster port (/dev/bus/usb/006/002) opened
USB-Blaster "USB-Blaster" firmware version 4.00
USB-Blaster endpoints out=02(64), in=81(64); urb size=1024
USB-Blaster reports JTAG protocol version 0, using version 0

jtagconfig configuration

~$ jtagconfig --addserver 127.0.0.1 123456
~$ jtagconfig
1) USB-Blaster on 127.0.0.1 [6-2]
020F30DD EP3C25/EP4CE22


Machine Learning Frameworks

CUDA

NVIDIA CUDA Toolkit Release Notes
Linux installation guide

Local system

~$ nvidia-debugdump -l
Found 1 NVIDIA devices
Device ID: 0
Device name: Quadro P600 (*PrimaryCard)
GPU internal ID: 0422018092726

~$ cat /etc/*release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18+deb9u1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)

Installing CUDA

  • Choose a suitable version to install according to the official guide.

Table 1. CUDA Toolkit and Compatible Driver Versions

| CUDA Toolkit | Linux x86_64 Driver Version | Windows x86_64 Driver Version |
| CUDA 10.0.130 | >= 410.48 | >= 411.31 |
| CUDA 9.2 (9.2.148 Update 1) | >= 396.37 | >= 398.26 |
| CUDA 9.2 (9.2.88) | >= 396.26 | >= 397.44 |
| CUDA 9.1 (9.1.85) | >= 390.46 | >= 391.29 |
| CUDA 9.0 (9.0.76) | >= 384.81 | >= 385.54 |
| CUDA 8.0 (8.0.61 GA2) | >= 375.26 | >= 376.51 |
| CUDA 8.0 (8.0.44) | >= 367.48 | >= 369.30 |
| CUDA 7.5 (7.5.16) | >= 352.31 | >= 353.66 |
| CUDA 7.0 (7.0.28) | >= 346.46 | >= 347.62 |
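The driver floor in Table 1 can be encoded as a small lookup helper — a sketch using the table's data; the dict and function names are mine, not an NVIDIA API:

```python
# Minimum Linux x86_64 driver version per CUDA toolkit release
# (values taken from Table 1 above; hypothetical helper for illustration).
CUDA_MIN_LINUX_DRIVER = {
    "10.0.130": (410, 48),
    "9.2.148": (396, 37),
    "9.0.76": (384, 81),
    "8.0.61": (375, 26),
}

def driver_ok(cuda_version: str, driver: str) -> bool:
    """Return True if the installed driver meets the toolkit's minimum."""
    required = CUDA_MIN_LINUX_DRIVER[cuda_version]
    installed = tuple(int(p) for p in driver.split("."))[:2]
    return installed >= required

print(driver_ok("10.0.130", "410.48"))  # the driver shown by nvidia-smi below
```

Tuple comparison handles the major/minor ordering, so "410.48" satisfies the CUDA 10.0 requirement while "396.54" does not.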
  • First purge any previously installed NVIDIA packages:
~$ dpkg -l | grep "nvidia" | awk '{print $2}' | xargs sudo dpkg --purge
  • Verify the installation:
~$ nvidia-smi
Fri Nov 23 11:00:29 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro P600 Off | 00000000:01:00.0 On | N/A |
| 34% 31C P8 N/A / N/A | 401MiB / 1999MiB | 4% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 898 G /usr/lib/xorg/Xorg 156MiB |
| 0 13787 G ...uest-channel-token=15869920746181936845 96MiB |
| 0 16550 G ...-token=D890EF91A7BB8E03F6D8D7795CC12E48 145MiB |
+-----------------------------------------------------------------------------+

  • Install the matching version of cuDNN:
~$ tar xvf cudnn-10.0-linux-x64-v7.4.1.5.tgz -C /usr/local

Installing the Nvidia Tesla driver on Debian Buster

~$ apt-get install nvidia-tesla-460-kernel-dkms nvidia-tesla-460-driver libnvidia-tesla-460-cuda1 nvidia-xconfig nvidia-tesla-460-smi
~$ nvidia-smi
Sat Mar 27 21:01:55 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GT 1030 On | 00000000:05:00.0 On | N/A |
| N/A 38C P5 N/A / 30W | 449MiB / 2000MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 3365 G /usr/lib/xorg/Xorg 326MiB |
| 0 N/A N/A 10847 G ...AAAAAAAAA= --shared-files 22MiB |
| 0 N/A N/A 15542 G ...chael/firefox/firefox-bin 0MiB |
| 0 N/A N/A 16429 G ...AAAAAAAA== --shared-files 97MiB |
+-----------------------------------------------------------------------------+

~$ sudo nvidia-xconfig # generates /etc/X11/xorg.conf; editing it by hand can leave the desktop unable to start. vdpauinfo can also be used to check the driver.

DKMS driver build error

/var/lib/dkms/nvidia-tesla-460/460.91.03/build/common/inc/nv-misc.h:20:12: fatal error: stddef.h: No such file or directory
   20 | #include <stddef.h> // NULL
      |          ^~~~~~~~~~

  • The error above means the headers under /usr/src/<linux-5.17-SRC>/include/linux are not on the include path. Below is a combined patch set for the driver; the patch file needs to be registered in /usr/src/nvidia-tesla-460-460.91.03/dkms.conf.
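For reference, the dkms.conf entry might look like the fragment below — PATCH[n] is standard DKMS syntax, and the path assumes the patch lives in a patches/ subdirectory of the source tree as shown below:

```shell
# Fragment of /usr/src/nvidia-tesla-460-460.91.03/dkms.conf (assumed layout):
# DKMS applies each PATCH[n] entry to the source tree before building.
PATCH[0]="patches/nvidia-tesla-460-linux-5.17-combind.patch"
```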
~$ cat nvidia-tesla-460-460.91.03/patches/nvidia-tesla-460-linux-5.17-combind.patch
diff -u a/Kbuild b/Kbuild
--- a/Kbuild 2021-07-02 14:04:57.000000000 +0800
+++ b/Kbuild 2022-05-15 12:38:09.968486119 +0800
@@ -68,7 +68,7 @@

EXTRA_CFLAGS += -I$(src)/common/inc
EXTRA_CFLAGS += -I$(src)
-EXTRA_CFLAGS += -Wall -MD $(DEFINES) $(INCLUDES) -Wno-cast-qual -Wno-error -Wno-format-extra-args
+EXTRA_CFLAGS += -Wall -MD $(DEFINES) $(INCLUDES) -Wno-cast-qual -Wno-error -Wno-format-extra-args -I./include/linux
EXTRA_CFLAGS += -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"460.91.03\" -Wno-unused-function -Wuninitialized -fno-strict-aliasing -mno-red-zone -mcmodel=kernel -DNV_UVM_ENABLE
EXTRA_CFLAGS += $(call cc-option,-Werror=undef,)
EXTRA_CFLAGS += -DNV_SPECTRE_V2=$(NV_SPECTRE_V2)

diff -ruN a/nvidia-uvm/uvm_linux.h b/nvidia-uvm/uvm_linux.h
--- a/nvidia-uvm/uvm_linux.h 2021-07-02 14:07:31.000000000 +0800
+++ b/nvidia-uvm/uvm_linux.h 2021-09-04 00:24:32.426673346 +0800
@@ -485,7 +485,7 @@
#elif (NV_WAIT_ON_BIT_LOCK_ARGUMENT_COUNT == 4)
static __sched int uvm_bit_wait(void *word)
{
- if (signal_pending_state(current->state, current))
+ if (signal_pending_state(current->__state, current))
return 1;
schedule();
return 0;

diff -u nvidia-tesla-460-460.91.03{,.old}/nvidia-drm/nvidia-drm-format.c
--- nvidia-tesla-460-460.91.03/nvidia-drm/nvidia-drm-format.c 2021-07-02 14:07:31.000000000 +0800
+++ nvidia-tesla-460-460.91.03.old/nvidia-drm/nvidia-drm-format.c 2022-05-15 15:17:23.498152286 +0800
@@ -29,6 +29,7 @@
#endif
#include <linux/kernel.h>

+#include "nvidia-uvm/uvm_linux.h"
#include "nvidia-drm-format.h"
#include "nvidia-drm-os-interface.h"

diff -u nvidia-tesla-460-460.91.03/common/inc/nv-procfs.h nvidia-tesla-460-460.91.03.old/common/inc/nv-procfs.h
--- nvidia-tesla-460-460.91.03/common/inc/nv-procfs.h 2021-07-02 14:07:32.000000000 +0800
+++ nvidia-tesla-460-460.91.03.old/common/inc/nv-procfs.h 2022-05-15 15:52:20.475063183 +0800
@@ -11,6 +11,11 @@
#define _NV_PROCFS_H

#include "conftest.h"
+#include <linux/version.h>
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(5,17,0))
+#define NV_PDE_DATA_PRESENT
+#define PDE_DATA(inode) pde_data(inode)
+#endif

#ifdef CONFIG_PROC_FS
#include <linux/proc_fs.h>
diff --git a/common/inc/nv-time.h b/common/inc/nv-time.h
index dc80806..cc343a5 100644
--- a/common/inc/nv-time.h
+++ b/common/inc/nv-time.h
@@ -23,6 +23,7 @@
#ifndef __NV_TIME_H__
#define __NV_TIME_H__

+#include <linux/version.h>
#include "conftest.h"
#include <linux/sched.h>
#include <linux/delay.h>
@@ -205,7 +206,12 @@ static inline NV_STATUS nv_sleep_ms(unsigned int ms)
// the requested timeout has expired, loop until less
// than a jiffie of the desired delay remains.
//
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(5, 14, 0))
current->state = TASK_INTERRUPTIBLE;
+#else
+ // Rel. commit "sched: Change task_struct::state" (Peter Zijlstra, Jun 11 2021)
+ WRITE_ONCE(current->__state, TASK_INTERRUPTIBLE);
+#endif
do
{
schedule_timeout(jiffies);

diff --git a/nvidia-drm/nvidia-drm-drv.c b/nvidia-drm/nvidia-drm-drv.c
index 84d4479..99ea552 100644
--- a/nvidia-drm/nvidia-drm-drv.c
+++ b/nvidia-drm/nvidia-drm-drv.c
@@ -20,6 +20,7 @@
* DEALINGS IN THE SOFTWARE.
*/

+#include <linux/version.h>
#include "nvidia-drm-conftest.h" /* NV_DRM_AVAILABLE and NV_DRM_DRM_GEM_H_PRESENT */

#include "nvidia-drm-priv.h"
@@ -903,9 +904,12 @@ static void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)

dev->dev_private = nv_dev;
nv_dev->dev = dev;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(5, 14, 0))
+ // Rel. commit "drm: Remove pdev field from struct drm_device" (Thomas Zimmermann, 3 May 2021)
if (device->bus == &pci_bus_type) {
dev->pdev = to_pci_dev(device);
}
+#endif

/* Register DRM device to DRM sub-system */

Installing PyCUDA

Installing Nvidia Docker

nvidia-docker

  • Install following the documentation linked above:
    # If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
    docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
    sudo apt-get purge -y nvidia-docker

    # Add the package repositories
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
    sudo apt-key add -
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update

    # Install nvidia-docker2 and reload the Docker daemon configuration
    sudo apt-get install -y nvidia-docker2
    sudo pkill -SIGHUP dockerd # important if Docker was already installed: signal the running daemon to reload its configuration.

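After installing nvidia-docker2, the package registers the NVIDIA runtime in /etc/docker/daemon.json; the file should end up looking roughly like this (shown for reference — verify against what your installed version actually wrote):

```json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```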
Running TensorFlow GPU in Docker

~$ docker pull tensorflow/tensorflow:latest-gpu
~$ nvidia-docker run -it --rm tensorflow/tensorflow:latest-gpu python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
2018-11-23 03:45:33.118841: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-11-23 03:45:33.196995: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node,so returning NUMA node zero
2018-11-23 03:45:33.197568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Quadro P600 major: 6 minor: 1 memoryClockRate(GHz): 1.5565
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.57GiB
[...]

Nvidia GPU Cloud (NGC) containers

Install and run the Caffe2 container:

~$ docker run --runtime=nvidia -it caffe2ai/caffe2:latest python -m caffe2.python.operator_test.relu_op_test
Trying example: test_relu(self=<__main__.TestRelu testMethod=test_relu>, X=array([[[-0.42894608],
[-0.65820682],
[ 0.39978197],
[...]

  • Installing from Nvidia GPU Cloud (NGC):

    ~$ docker pull nvcr.io/nvidia/caffe2:18.08-py3
    # run the test.
    ~$ nvidia-docker run --runtime=nvidia -it nvcr.io/nvidia/caffe2:18.08-py3 python -m caffe2.python.operator_test.relu_op_test
  • The example below runs a jupyter notebook inside Docker, mapping the container's port 8888 to port 9999 on the host; it can then be reached from a browser on the host at http://127.0.0.1:9999/.

  • --rm removes the container after it exits

  • -it runs the container interactively

  • -v maps a host directory into the container: below, the host's /data/AI-DIR/TensorFlow/jupyter-notebook is mounted at /data/jupyter inside the container.

~$ nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -v /data/AI-DIR/TensorFlow/jupyter-notebook:/data/jupyter -it -p 9999:8888 nvcr.io/nvidia/caffe2:18.08-py3 sh -c "jupyter notebook --no-browser --allow-root --ip 0.0.0.0 /data/jupyter"

============
== Caffe2 ==
============

NVIDIA Release 18.08 (build 599137)

Container image Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
[...]

Installing PyTorch

~$ docker pull nvcr.io/nvidia/pytorch:18.11-py3

Building PyTorch from source

~$ pyenv activate py3dev  # enter the Python 3.6 virtualenv managed by pyenv.
~$ pip install numpy pyyaml mkl mkl-include setuptools cmake cffi typing # install build dependencies into py3dev.
~$ export PATH=/usr/local/cuda-10.0/bin:$PATH
~$ export CUDA=1
~$ pip install pycuda
~$ git clone --recursive https://github.com/pytorch/pytorch
~$ cd pytorch
~$ python setup.py install
  • If numpy fails with libmkl_rt.so: cannot open shared object file: No such file or directory, install libmkl_rt and then reinstall numpy.
  • caffe2 has now been merged into the PyTorch source tree. If importing the modules below raises ModuleNotFoundError: No module named 'past', first install the dependency with pip install future.
import matplotlib.pyplot as plt
import numpy as np
import os
import shutil
import caffe2.python.predictor.predictor_exporter as pe
from caffe2.python import core,model_helper,net_drawer,workspace,visualize,brew
ModuleNotFoundError: No module named 'past'
  • The warning net_drawer will not run correctly. Please install the correct dependencies. means pydot is missing; install it with pip install pydot.

Building TensorFlow from source (with CUDA 10 support)

Installing Bazel

~$ echo 'deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8' | sudo tee /etc/apt/sources.list.d/bazel.list
~$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
~$ sudo apt-get update
~$ sudo apt-get install bazel
  • Because of network restrictions (the GFW), the install above may be very slow; an installer script can be downloaded directly from https://github.com/bazelbuild/bazel/releases instead. Note that apt-get install bazel currently installs the latest 0.20.0, while TensorFlow 1.12.0 only builds with bazel 0.19.2.
$ bazel version
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Build label: 0.19.2
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Mon Nov 19 16:25:09 2018 (1542644709)
Build timestamp: 1542644709
Build timestamp as int: 1542644709

Fetching and configuring the TensorFlow source

~$ export PATH=/usr/local/cuda/bin:$PATH
~$ export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:/usr/local/cuda-10.0/extras/CUPTI/lib64:$LD_LIBRARY_PATH
~$ git clone https://github.com/tensorflow/tensorflow.git
~$ pyenv activate py3dev
~$ pip install wheel
~$ ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.19.2 installed.
Please specify the location of python. [Default is fullpath/.pyenv/versions/py3dev/bin/python]:

Found possible Python library paths:
/fullpath/.pyenv/versions/py3dev/lib/python3.6/site-packages
Please input the desired Python library path to use. Default is [/fullpath/.pyenv/versions/py3dev/lib/python3.6/site-packages]
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: Y
XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: N
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 10.0
Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]:
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Do you wish to build TensorFlow with TensorRT support? [y/N]: N
No TensorRT support will be enabled for TensorFlow.

Please specify the locally installed NCCL version you want to use. [Default is to use https://github.com/nvidia/nccl]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,7.0]:

Do you want to use clang as CUDA compiler? [y/N]: N
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]: N
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: y
Please specify the home path of the Android NDK to use. [Default is /fullpath/Android/Sdk/ndk-bundle]:
Please specify the home path of the Android SDK to use. [Default is /fullpath/Android/Sdk]:
Please specify the Android SDK API level to use. [Available levels: ['13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28']] [Default is 28]:

Please specify an Android build tools version to use. [Available versions: ['21.1.2', '23.0.3', '24.0.3', '25.0.0', '25.0.2', '25.0.3', '26.0.2', '27.0.0', '27.0.3', '28.0.0-rc2', '28.0.2', '28.0.3']] [Default is 28.0.3]:

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
--config=gdr # Build with GDR support.
--config=verbs # Build with libverbs support.
--config=ngraph # Build with Intel nGraph support.
--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
--config=noaws # Disable AWS S3 filesystem support.
--config=nogcp # Disable GCP support.
--config=nohdfs # Disable HDFS support.
--config=noignite # Disable Apacha Ignite support.
--config=nokafka # Disable Apache Kafka support.
--config=nonccl # Disable NVIDIA NCCL support.
Configuration finished

~$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

~$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg # build the pip package.
Wed Dec 5 10:54:27 CST 2018 : === Preparing sources in dir: /tmp/tmp.OZdwkuc2YO
~/github/tensorflow ~/github/tensorflow
[...]

~$ pip install /tmp/tensorflow_pkg/tensorflow-1.12.0rc0-cp36-cp36m-linux_x86_64.whl # install the built wheel.
~$ LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-10.0/extras/CUPTI/lib64:$LD_LIBRARY_PATH jupyter notebook # test TensorFlow from a notebook.
  • If the error failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error appears as below, first run apt-get install nvidia-modprobe in a terminal, then reboot the system.
In [1]: import tensorflow as tf
In [2]: tf.test.is_built_with_cuda()
Out[2]: True
In [3]: tf.test.is_gpu_available(cuda_only=False,min_cuda_compute_capability=None)
2018-12-05 12:03:06.128401: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
2018-12-05 12:03:06.128442: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:161] retrieving CUDA diagnostic information for host: debian
2018-12-05 12:03:06.128448: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:168] hostname: debian
2018-12-05 12:03:06.128470: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:192] libcuda reported version is: 410.48.0
2018-12-05 12:03:06.128488: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:196] kernel reported version is: 410.48.0
2018-12-05 12:03:06.128493: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:303] kernel version seems to match DSO: 410.48.0
Out[3]: False

In [4]: tf.test.is_gpu_available(cuda_only=False,min_cuda_compute_capability=None)
Out[4]: False

# After rebooting, the GPU works normally.

In [1]: import tensorflow as tf

In [2]: tf.Session().list_devices()
2018-12-05 15:02:22.981018: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-05 15:02:22.982813: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55d255632230 executing computations on platform CUDA. Devices:
2018-12-05 15:02:22.982835: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): Quadro P600, Compute Capability 6.1
2018-12-05 15:02:22.983889: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1431] Found device 0 with properties:
name: Quadro P600 major: 6 minor: 1 memoryClockRate(GHz): 1.5565
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.74GiB
2018-12-05 15:02:22.983931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Adding visible gpu devices: 0
2018-12-05 15:02:22.986678: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-05 15:02:22.986711: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-05 15:02:22.986726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-05 15:02:22.986953: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1113] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1560 MB memory) -> physical GPU (device: 0, name: Quadro P600, pci bus id: 0000:01:00.0, compute capability: 6.1)
Out[2]:
[_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456, 4411150611837152607),
_DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 8331037032149977949),
_DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_GPU:0, XLA_GPU, 17179869184, 1279689307458374322),
_DeviceAttributes(/job:localhost/replica:0/task:0/device:GPU:0, GPU, 1636106240, 7170667474598106347)]

In [3]: tf.test.is_gpu_available(cuda_only=False,min_cuda_compute_capability=None)
2018-12-05 15:05:52.037618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Adding visible gpu devices: 0
2018-12-05 15:05:52.037647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-05 15:05:52.037652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-05 15:05:52.037656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-05 15:05:52.037737: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1113] Created TensorFlow device (/device:GPU:0 with 1560 MB memory) -> physical GPU (device: 0, name: Quadro P600, pci bus id: 0000:01:00.0, compute capability: 6.1)
Out[3]: True

Running the TensorBoard visualization frontend

import tensorflow as tf
input1 = tf.constant([1.0,2.0,3.0],name='input1')
input2 = tf.constant([2.0,3.0,4.0],name='input2')
output = tf.add_n([input1,input2],name='add')
with tf.Session() as sess:
    writer = tf.summary.FileWriter(graph=sess.graph,logdir='./graph')
    sess.run(output)
  • After running the snippet above, start tensorboard --logdir='./graph' --port=6006 in a terminal; once its web server is up, the visualization frontend can be opened in a browser.

TensorFlow usage notes

Reading and writing TFRecords

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
# Helpers that wrap raw values in tf.train.Feature.
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

# mnist/data holds the MNIST data downloaded from the net.
mnist = input_data.read_data_sets('./mnist/data',dtype=tf.uint8,one_hot=True)
images = mnist.train.images
labels = mnist.train.labels

pixels = images.shape[1]
num_examples = mnist.train.num_examples

filename = './mnist/output.tfrecords'

# Convert one record into tf.train.Example format.
def _make_example(pixels, label, image):
    image_raw = image.tostring()
    example = tf.train.Example(features=tf.train.Features(feature={
        'pixels': _int64_feature(pixels),
        'label': _int64_feature(np.argmax(label)),
        'image_raw': _bytes_feature(image_raw)
    }))
    return example

with tf.python_io.TFRecordWriter(filename) as writer:
    for index in range(num_examples):
        example = _make_example(pixels,labels[index],images[index])
        writer.write(example.SerializeToString())
print('TFRecord test file written')
  • Read the TFRecord file back using the same feature schema.
    reader = tf.TFRecordReader()
    filename_queue = tf.train.string_input_producer(['./mnist/output.tfrecords'])
    _,serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'pixels': tf.FixedLenFeature([],tf.int64),
            'label': tf.FixedLenFeature([],tf.int64),
            'image_raw': tf.FixedLenFeature([],tf.string),
        })

    images = tf.decode_raw(features['image_raw'],tf.uint8)
    labels = tf.cast(features['label'],tf.int32)
    pixels = tf.cast(features['pixels'],tf.int32)


    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        for i in range(10):
            image,label,pixel = sess.run([images,labels,pixels])
        coord.request_stop()
        coord.join(threads)
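The key point — records must be parsed with the same schema they were written with — can be seen in a minimal stdlib analogue, where struct stands in for the Example protobuf (a sketch of the idea, not the TFRecord wire format):

```python
import struct

# Write: pack a fixed schema (one int64 label, then a 4-byte raw image blob).
record = struct.pack("<q4s", 7, b"\x01\x02\x03\x04")

# Read: unpacking with the same schema string recovers the fields;
# a different schema would misinterpret the bytes.
label, image_raw = struct.unpack("<q4s", record)
print(label, image_raw)  # → 7 b'\x01\x02\x03\x04'
```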

Reading raw images

import matplotlib.pyplot as plt

image_raw_data = tf.gfile.FastGFile('../img3.png','rb').read()
with tf.Session() as sess:
    img_data = tf.image.decode_png(image_raw_data)
    # Print the decoded 3-D matrix.
    print(img_data.eval())
    plt.imshow(img_data.eval())
    plt.show()
    img_data.set_shape([420,420,3])
    print(img_data.get_shape())

  • Resizing images — the method argument selects the algorithm:

| Method | Algorithm |
| 0 | Bilinear interpolation |
| 1 | Nearest neighbor interpolation |
| 2 | Bicubic interpolation |
| 3 | Area interpolation |
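Method 1 (nearest neighbor) is the simplest of the four to illustrate; a pure-Python sketch on a list-of-lists image (the helper is mine, not the TensorFlow implementation):

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbor resize: each output pixel copies the closest source pixel."""
    h, w = len(img), len(img[0])
    return [[img[i * h // new_h][j * w // new_w] for j in range(new_w)]
            for i in range(new_h)]

src = [[0, 1],
       [2, 3]]
print(resize_nearest(src, 4, 4))
# → [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
```

Bilinear and bicubic interpolation instead blend neighboring source pixels, which is why they produce smoother (but blurrier) upscales.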

with tf.Session() as sess:
    # Feeding 0-255 integers straight into resize_images yields reals in 0-255,
    # which is awkward downstream; the book recommends converting the image to
    # 0-1 floats before resizing.
    image_float = tf.image.convert_image_dtype(img_data,tf.float32)
    resized = tf.image.resize_images(image_float,[400,400],method=0)
    plt.imshow(resized.eval())
    plt.show()
  • Cropping and padding images

    with tf.Session() as sess:
        croped = tf.image.resize_image_with_crop_or_pad(img_data,300,300)
        padded = tf.image.resize_image_with_crop_or_pad(img_data,520,520)
        plt.imshow(croped.eval())
        plt.show()
        plt.imshow(padded.eval())
        plt.show()
  • Cropping the central 50% region

    with tf.Session() as sess:
        central_cropped = tf.image.central_crop(img_data, 0.5)
        plt.imshow(central_cropped.eval())
        plt.show()

Installing and using Keras

  • Keras provides high-level Python APIs for quickly building and training deep learning models, with TensorFlow or Theano as the backend. It is minimal, and its modular approach makes building and running neural networks lightweight.
    In [1]: import keras
    Using TensorFlow backend.

    In [2]: keras.__version__
    Out[2]: '2.2.4'

    In [3]: !cat /home/lcy/.keras/keras.json
    {
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
    }

Keras MNIST handwritten-digit test

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential,load_model
from keras.layers import Dense,Dropout,Conv2D,Flatten,MaxPooling2D,Activation
from keras.datasets.mnist import load_data
import os

# Clear the GPU session state.
keras.backend.clear_session()
(X_train,Y_train),(x_test,y_test) = load_data()
X_train = X_train.reshape(X_train.shape[0],28,28,1)
x_test = x_test.reshape(x_test.shape[0],28,28,1)
input_shape = (28,28,1)
X_train = X_train.astype('float32')
x_test = x_test.astype('float32')
X_train /= 255
x_test /= 255
print('x_train shape:',X_train.shape)
print('Number of images in x_train',X_train.shape[0])
print('Number of images in x_test',x_test.shape[0])

# Convolutional network.
model = Sequential()
model.add(Conv2D(28,kernel_size=(3,3),input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.02))
model.add(Dense(10))
model.add(Activation('softmax'))

# Compile a multi-class classifier.
model.compile(optimizer='adam',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(x=X_train,y=Y_train,epochs=10)

# Save the trained model to an HDF5 file.
history = model.fit(X_train,Y_train,batch_size=128,epochs=20,verbose=2,validation_data=(x_test,y_test))
save_dir = './results/'
mode_name = 'keras_mnist.h5'
mode_path = os.path.join(save_dir,mode_name)
model.save(mode_path)
print('Saved trained model at %s' % mode_path)

# Plot the training curves.
fig = plt.figure()
plt.subplot(2,1,1)
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='lower right')

plt.subplot(2,1,2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')

plt.tight_layout()
fig

# Validate the model with part of the test images.
mnist_model = load_model('./results/keras_mnist.h5')
loss_and_metrics = mnist_model.evaluate(x_test,y_test,verbose=2)
print('Test Loss',loss_and_metrics[0])
print('Test Accuracy',loss_and_metrics[1])
predicted_classes = mnist_model.predict_classes(x_test)
correct_indices = np.nonzero(predicted_classes == y_test)[0]
incorrect_indices = np.nonzero(predicted_classes != y_test)[0]
print()
print(len(correct_indices),' classified correctly')
print(len(incorrect_indices),' classified incorrectly')

plt.rcParams['figure.figsize'] = (7,14)
figure_evaluation = plt.figure()

# Show 9 correctly predicted images.
for i,correct in enumerate(correct_indices[:9]):
    plt.subplot(6,3,i+1)
    plt.imshow(x_test[correct].reshape(28,28),cmap='gray',interpolation='none')
    plt.title('Predicted: {}, Truth: {}'.format(predicted_classes[correct],y_test[correct]))
    plt.xticks([])
    plt.yticks([])
# Show 9 incorrectly predicted images (offset so they don't overwrite the first nine).
for i,incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(6,3,i+10)
    plt.imshow(x_test[incorrect].reshape(28,28),cmap='gray',interpolation='none')
    plt.title('Predicted: {}, Truth: {}'.format(predicted_classes[incorrect],y_test[incorrect]))
    plt.xticks([])
    plt.yticks([])
figure_evaluation


# Load our own hand-made images for testing.

from PIL import Image
from keras_preprocessing.image import img_to_array
from keras_applications import imagenet_utils
data_dir = '/data/AI-DIR/TensorFlow/mnist/test-data/'
list_dir = os.listdir(data_dir)
print(len(list_dir))
image_height = 28
image_width = 28
channels = 1
img_data = np.ndarray(shape=(len(list_dir),image_height,image_width,channels))
label_data = np.zeros(len(list_dir),dtype='uint8')
i = 0
for file in list_dir:
    # Read each image in the directory and convert it to grayscale.
    png = Image.open(os.path.join(data_dir,file),'r').convert('L')
    gray = png.point(lambda x: 0 if x == 255 else 255)
    image = img_to_array(gray)
    img_data[i] = image
    label_data[i] = int(file[4])
    # print('png file: ',file)
    i += 1

print('test_data len',len(img_data))
print('test_data shape',img_data.shape)
print('test label ',label_data)

loss_and_metrics = mnist_model.evaluate(img_data,label_data,verbose=2)
print('Test Loss',loss_and_metrics[0])
print('Test Accuracy',loss_and_metrics[1])

predicted_classes = mnist_model.predict_classes(img_data)
correct_indices = np.nonzero(predicted_classes == label_data)[0]
incorrect_indices = np.nonzero(predicted_classes != label_data)[0]
print()
print(len(correct_indices),' classified correctly')
print(len(incorrect_indices),' classified incorrectly')
plt.rcParams['figure.figsize'] = (7,14)
figure_evaluation = plt.figure()

# Show 9 incorrectly predicted images.
for i,incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(6,3,i+1)
    plt.imshow(img_data[incorrect].reshape(28,28),cmap='gray',interpolation='none')
    plt.title('Predicted: {}, Truth: {}'.format(predicted_classes[incorrect],label_data[incorrect]))
    plt.xticks([])
    plt.yticks([])

figure_evaluation
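The correct/incorrect bookkeeping above (np.nonzero on an elementwise comparison) reduces to a simple index partition; a stdlib-only sketch of the same idea (the function is mine, for illustration):

```python
def partition_indices(predicted, truth):
    """Split sample indices into (correct, incorrect) by comparing predictions
    with ground-truth labels — a plain-Python analogue of np.nonzero(a == b)."""
    correct = [i for i, (p, t) in enumerate(zip(predicted, truth)) if p == t]
    incorrect = [i for i, (p, t) in enumerate(zip(predicted, truth)) if p != t]
    return correct, incorrect

good, bad = partition_indices([7, 2, 1, 0], [7, 2, 7, 0])
print(len(good), 'classified correctly;', len(bad), 'classified incorrectly')
# → 3 classified correctly; 1 classified incorrectly
```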

Errors

Importing the tkinter module

  • matplotlib may complain that the tkinter module cannot be imported; fixing this can be fiddly. The error looks like:

    In [4]: import matplotlib.pyplot as plt
    ---------------------------------------------------------------------------
    ModuleNotFoundError Traceback (most recent call last)
    <ipython-input-4-a0d2faabd9e9> in <module>()
    ----> 1 import matplotlib.pyplot as plt
    [...]
    ModuleNotFoundError: No module named '_tkinter'
  • The fix is as follows:
    ~$ apt-get install tk-dev
    ~$ pyenv uninstall 3.6.6
    ~$ pyenv install 3.6.6
    ~$ pyenv virtualenv 3.6.6 py3dev
    ~$ pyenv activate py3dev
    ~$ python -m tkinter # test the module.
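If rebuilding Python is not an option, a workaround on the matplotlib side is to fall back to the file-only Agg backend when _tkinter is missing, since Agg needs no GUI toolkit. A minimal sketch (pick_backend is a hypothetical helper, not part of matplotlib):

```python
import importlib.util

def pick_backend():
    """Return 'TkAgg' when the tkinter module is importable, otherwise
    the non-interactive 'Agg' backend, which does not need _tkinter."""
    if importlib.util.find_spec('tkinter') is not None:
        return 'TkAgg'
    return 'Agg'

# matplotlib.use(pick_backend()) would then be called before importing pyplot.
```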

Importing the ggplot module

  • Importing the package with from ggplot import * fails as follows:
~/.pyenv/versions/3.6.6/envs/py3dev/lib/python3.6/site-packages/ggplot/stats/smoothers.py in <module>
2 unicode_literals)
3 import numpy as np
----> 4 from pandas.lib import Timestamp
5 import pandas as pd
6 import statsmodels.api as sm

ImportError: cannot import name 'Timestamp'

  • Fix: edit the file .../site-packages/ggplot/stats/smoothers.py and change the original from pandas.lib import Timestamp to from pandas import Timestamp, then save.
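Editing site-packages by hand works but is lost on the next reinstall. The same library reorganization can also be survived with a fallback-import pattern; a sketch (import_compat is a hypothetical helper):

```python
import importlib

def import_compat(name, module_paths):
    """Try importing `name` from each module path in order and return the
    first hit -- the pattern behind 'from pandas.lib import Timestamp'
    versus 'from pandas import Timestamp'."""
    for path in module_paths:
        try:
            return getattr(importlib.import_module(path), name)
        except (ImportError, AttributeError):
            continue
    raise ImportError('%s not found in any of %r' % (name, module_paths))

# e.g. Timestamp = import_compat('Timestamp', ['pandas.lib', 'pandas'])
```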

Installing the Kaggle API

  • Kaggle API
  • Register an account at www.kaggle.com and open the account settings page. Under the API section there are two buttons, Create New API Token and Expire API Token. Clicking Create New API Token downloads a file named kaggle.json, and a toast appears: Ensure kaggle.json is in the location ~/.kaggle/kaggle.json to use the API.
~$ pip install kaggle
~$ mkdir ~/.kaggle
~$ mv ~/Downloads/kaggle.json ~/.kaggle/
~$ chmod 600 ~/.kaggle/kaggle.json
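The chmod 600 step matters because the kaggle CLI warns about credentials readable by other users. A quick self-check of the setup above (kaggle_json_ok is a hypothetical helper) might look like:

```python
import os
import stat

def kaggle_json_ok(path='~/.kaggle/kaggle.json'):
    """True when the credential file exists and only the owner can read
    or write it (mode 600), matching the chmod step above."""
    p = os.path.expanduser(path)
    if not os.path.isfile(p):
        return False
    return stat.S_IMODE(os.stat(p).st_mode) == 0o600
```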

Downloading data

  • Go to https://www.kaggle.com/competitions, open a specific competition, and accept the rules dialog at the bottom of the page (I Understand and Accept); otherwise that competition's data cannot be downloaded.
# Download the data into a specified directory.
~$ kaggle competitions download -c traveling-santa-2018-prime-paths -p /fullpath/Traveling-Santa-2018-Prime-Paths/
~$ kaggle competitions list
ref deadline category reward teamCount userHasEntered
--------------------------------------------- ------------------- --------------- --------- --------- --------------
digit-recognizer 2030-01-01 00:00:00 Getting Started Knowledge 2708 True
titanic 2030-01-01 00:00:00 Getting Started Knowledge 10578 False
house-prices-advanced-regression-techniques 2030-01-01 00:00:00 Getting Started Knowledge 4519 False
imagenet-object-localization-challenge 2029-12-31 07:00:00 Research Knowledge 30 False
competitive-data-science-predict-future-sales 2019-12-31 23:59:00 Playground Kudos 1869 False
histopathologic-cancer-detection 2019-03-31 23:59:00 Playground Knowledge 140 False
humpback-whale-identification 2019-02-28 23:59:00 Featured $25,000 144 False
elo-merchant-category-recommendation 2019-02-26 23:59:00 Featured $50,000 630 False
ga-customer-revenue-prediction 2019-02-15 23:59:00 Featured $45,000 1104 False
quora-insincere-questions-classification 2019-02-05 23:59:00 Featured $25,000 1666 False
pubg-finish-placement-prediction 2019-01-30 23:59:00 Playground Swag 857 False
human-protein-atlas-image-classification 2019-01-10 23:59:00 Featured $37,000 1388 False
traveling-santa-2018-prime-paths 2019-01-10 23:59:00 Featured $25,000 958 True
[...]

FFmpeg with CUDA hardware encoding/decoding

  • Install the necessary packages.
~$ sudo apt-get install nvidia-cuda-toolkit nvidia-cuda-toolkit-gcc  yasm cmake libtool \
libc6 libc6-dev unzip wget libnuma1 libnuma-dev libnvidia-encode1
~$ git clone https://git.videolan.org/git/ffmpeg/nv-codec-headers.git # or the mirror https://github.com/FFmpeg/nv-codec-headers
~$ cd nv-codec-headers && sudo make install
  • Clone FFmpeg’s public GIT repository.
~$ git clone https://git.ffmpeg.org/ffmpeg.git ffmpeg/

~$ cd ffmpeg
~$ ./configure --enable-nonfree --enable-cuda-nvcc --enable-libnpp \
--enable-libmp3lame --enable-v4l2-m2m --enable-vdpau --enable-vaapi \
--enable-libdrm --enable-libx264 --enable-libvpx --enable-libwebp \
--enable-libv4l2 --enable-libopus --enable-libopencore-amrnb --enable-libopencore-amrwb \
--enable-librtmp --enable-gpl --enable-version3 --enable-libvorbis \
--disable-doc --disable-htmlpages --disable-manpages --disable-podpages \
--disable-txtpages --enable-shared

# Alternative: configure with AMF support instead of cuda-nvcc.
~$ ./configure --enable-nonfree --enable-amf --enable-libnpp \
--enable-libmp3lame --enable-v4l2-m2m --enable-vdpau --enable-vaapi \
--enable-libdrm --enable-libx264 --enable-libvpx --enable-libwebp \
--enable-libv4l2 --enable-libopus --enable-libopencore-amrnb --enable-libopencore-amrwb \
--enable-librtmp --enable-gpl --enable-version3 --enable-libvorbis \
--disable-doc --disable-htmlpages --disable-manpages --disable-podpages \
--disable-txtpages --disable-static --enable-shared

~$ make -j$(nproc) && sudo make install


~$ LD_LIBRARY_PATH=/usr/local/lib ffmpeg -hwaccels
ffmpeg version N-110065-g30cea1d39b Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 10 (Debian 10.2.1-6)
configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --disable-static --enable-shared
libavutil 58. 5.100 / 58. 5.100
libavcodec 60. 6.101 / 60. 6.101
libavformat 60. 4.100 / 60. 4.100
libavdevice 60. 2.100 / 60. 2.100
libavfilter 9. 4.100 / 9. 4.100
libswscale 7. 2.100 / 7. 2.100
libswresample 4. 11.100 / 4. 11.100
Hardware acceleration methods:
vdpau
cuda
vaapi

  • If your CUDA is installed in /usr/local/cuda, append the following to the configure command:
~$ ./configure ....
--extra-cflags=-I/usr/local/cuda/include \
--extra-ldflags=-L/usr/local/cuda/lib64

Runtime errors

~$ ffmpeg  -hwaccel cuda -hwaccel_output_format cuda -f v4l2  -i /dev/video0  -c:a copy -c:v h264_nvenc -b:v 5M output.mp4 -y -loglevel debug
[h264_nvenc @ 0x55aacc4caf00] Driver does not support the required nvenc API version. Required: 12.0 Found: 11.1
[h264_nvenc @ 0x55aacc4caf00] The minimum required Nvidia driver for nvenc is 520.56.06 or newer
[h264_nvenc @ 0x55aacc4caf00] Nvenc unloaded

  • Reinstall nv-codec-headers from a branch/tag that matches your driver.
~$ dpkg -l | grep "cuda"
ii cuda-keyring 1.0-1 all GPG keyring for the CUDA repository
ii libcuda1:amd64 470.161.03-1 amd64 NVIDIA CUDA Driver Library
ii libcudart11.0:amd64 11.2.152~11.2.2-3+deb11u3 amd64 NVIDIA CUDA Runtime Library
ii nvidia-cuda-dev:amd64 11.2.2-3+deb11u3 amd64 NVIDIA CUDA development files
ii nvidia-cuda-toolkit 11.2.2-3+deb11u3 amd64 NVIDIA CUDA development toolkit


~$ cd nv-codec-headers
~$ git tag
n10.0.26.0
n10.0.26.1
n10.0.26.2
n11.0.10.0
n11.0.10.1
n11.0.10.2
n11.1.5.0
n11.1.5.1
n11.1.5.2
n12.0.16.0
[...]
~$ git checkout n11.1.5.2
~$ sudo make install
  • Then rebuild FFmpeg. With nv-codec-headers that are too old for the FFmpeg sources, the build may fail like this:
In file included from libavutil/hwcontext_cuda.c:27:
libavutil/hwcontext_cuda.c: In function ‘cuda_context_init’:
libavutil/hwcontext_cuda.c:365:28: error: ‘CudaFunctions’ has no member named ‘cuCtxGetCurrent’; did you mean ‘cuCtxPopCurrent’?
365 | ret = CHECK_CU(cu->cuCtxGetCurrent(&hwctx->cuda_ctx));
| ^~~~~~~~~~~~~~~
libavutil/cuda_check.h:65:114: note: in definition of macro ‘FF_CUDA_CHECK_DL’
65 | #define FF_CUDA_CHECK_DL(avclass, cudl, x) ff_cuda_check(avclass, cudl->cuGetErrorName, cudl->cuGetErrorString, (x), #x)
| ^
libavutil/hwcontext_cuda.c:365:15: note: in expansion of macro ‘CHECK_CU’
365 | ret = CHECK_CU(cu->cuCtxGetCurrent(&hwctx->cuda_ctx));
| ^~~~~~~~
make: *** [ffbuild/common.mak:81: libavutil/hwcontext_cuda.o] Error 1
make: *** Waiting for unfinished jobs....
CC libavutil/hwcontext_vaapi.o
In file included from /usr/include/CL/cl.h:20,
from libavutil/hwcontext_opencl.h:25,
from libavutil/hwcontext_opencl.c:30:
/usr/include/CL/cl_version.h:22:9: note: ‘#pragma message: cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 300 (OpenCL 3.0)’
22 | #pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 300 (OpenCL 3.0)")
| ^~~~~~~
STRIP libswscale/x86/output.o

Installing VA-API support

~$ sudo apt-get install libnvcuvid1  libgstreamer-plugins-bad1.0-dev \
meson gstreamer1.0-plugins-bad libva-dev -y
  • To compile FFmpeg on Linux, do the following:
~$ git clone https://git.videolan.org/git/ffmpeg/nv-codec-headers.git
~$ cd nv-codec-headers && sudo make install
~$ git clone https://github.com/elFarto/nvidia-vaapi-driver
~$ cd nvidia-vaapi-driver && meson setup build
~$ sudo meson install -C build
  • Running
export LIBGL_DEBUG=verbose
export LIBVA_DRIVER_NAME=nvidia
~$ vainfo
libva info: VA-API version 1.17.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.17 (libva 2.12.0)
vainfo: Driver version: VA-API NVDEC driver [egl backend]
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileH264Main : VAEntrypointVLD
VAProfileH264High : VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain12 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointVLD
  • Detailed vainfo error output
~$ NVD_LOG=1 vainfo
libva info: VA-API version 1.17.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
214878.067671198 [2010296-2010296] ../src/vabackend.c: 108 init CUDA ERROR 'unknown error' (999)

libva info: Found init function __vaDriverInit_1_0
214878.067694762 [2010296-2010296] ../src/vabackend.c:1872 __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 0x55a1fedadf50 10
214878.067698198 [2010296-2010296] ../src/vabackend.c:1894 __vaDriverInit_1_0 Now have 0 (0 max) instances
214878.067700042 [2010296-2010296] ../src/vabackend.c:1916 __vaDriverInit_1_0 Selecting EGL backend
214878.071761148 [2010296-2010296] ../src/export-buf.c: 150 findGPUIndexFromFd Defaulting to CUDA GPU ID 0. Use NVD_GPU to select a specific CUDA GPU
214878.071770746 [2010296-2010296] ../src/export-buf.c: 163 findGPUIndexFromFd Looking for GPU index: 0
214878.073034516 [2010296-2010296] ../src/export-buf.c: 175 findGPUIndexFromFd Found 3 EGL devices
214878.074061472 [2010296-2010296] ../src/export-buf.c: 229 findGPUIndexFromFd No EGL_CUDA_DEVICE_NV support for EGLDevice 0
214878.074069096 [2010296-2010296] ../src/export-buf.c: 229 findGPUIndexFromFd No EGL_CUDA_DEVICE_NV support for EGLDevice 1
214878.074074135 [2010296-2010296] ../src/export-buf.c: 232 findGPUIndexFromFd No DRM device file for EGLDevice 2
214878.074076840 [2010296-2010296] ../src/export-buf.c: 235 findGPUIndexFromFd No match found, falling back to default device
214878.075083408 [2010296-2010296] ../src/export-buf.c: 289 egl_initExporter Driver supports 16-bit surfaces
214878.075096823 [2010296-2010296] ../src/vabackend.c:1948 __vaDriverInit_1_0 CUDA ERROR 'initialization error' (3)

214878.075100831 [2010296-2010296] ../src/export-buf.c: 65 egl_releaseExporter Releasing exporter, 0 outstanding frames
214878.075109497 [2010296-2010296] ../src/export-buf.c: 82 egl_releaseExporter Done releasing frames
libva error: /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
vaInitialize failed with error code 1 (operation failed),exit
  • The error above appears after resuming from hibernation; reload the nvidia_uvm kernel module:
sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm
  • Create a systemd service to redo the settings and reload the module after hibernation:
~$ cat /etc/pm/sleep.d/after-hibernate.sh
#!/bin/bash
# on bookworm will get vaInitialize failed with error code 1 after hibernate.
rmmod nvidia_uvm
modprobe nvidia_uvm

exit 0

  • systemd service
~$ cat /etc/systemd/system/rfh.service
[Unit]
Description=Run script after hibernate recovery
#After=suspend.target
After=hibernate.target
#After=hybrid-sleep.target
[Service]
ExecStart=/etc/pm/sleep.d/after-hibernate.sh
[Install]
#WantedBy=suspend.target
WantedBy=hibernate.target
#WantedBy=hybrid-sleep.target

~$ systemctl enable rfh

Using GStreamer

export LIBVA_DRIVER_NAME=nvidia
export GST_VAAPI_ALL_DRIVERS=1

~$ gst-inspect-1.0 vaapi
Plugin Details:
Name vaapi
Description VA-API based elements
Filename /lib/x86_64-linux-gnu/gstreamer-1.0/libgstvaapi.so
Version 1.22.0
License LGPL
Source module gstreamer-vaapi
Documentation https://gstreamer.freedesktop.org/documentation/vaapi/
Source release date 2023-01-23
Binary package gstreamer-vaapi
Origin URL https://tracker.debian.org/pkg/gstreamer-vaapi

vaapidecodebin: VA-API Decode Bin
vaapih264dec: VA-API H264 decoder
vaapih265dec: VA-API H265 decoder
vaapimpeg2dec: VA-API MPEG2 decoder
vaapisink: VA-API sink
vaapivc1dec: VA-API VC1 decoder
vaapivp9dec: VA-API VP9 decoder

7 features:
+-- 7 elements

  • As shown above, only the decoders listed are supported, and nvidia-vaapi-driver does not yet support vaapipostproc (VA-API video postprocessing), so vaapisink cannot be used for hardware-decoded playback.

    GST_DEBUG=nvdec*:6,nvenc*:6 gst-inspect-1.0 nvdec
    ~$ gst-inspect-1.0 nvcodec
    Plugin Details:
    Name nvcodec
    Description GStreamer NVCODEC plugin
    Filename /lib/x86_64-linux-gnu/gstreamer-1.0/libgstnvcodec.so
    Version 1.22.0
    License LGPL
    Source module gst-plugins-bad
    Documentation https://gstreamer.freedesktop.org/documentation/nvcodec/
    Source release date 2023-01-23
    Binary package GStreamer Bad Plugins (Debian)
    Origin URL https://tracker.debian.org/pkg/gst-plugins-bad1.0

    cudaconvert: CUDA colorspace converter
    cudaconvertscale: CUDA colorspace converter and scaler
    cudadownload: CUDA downloader
    cudascale: CUDA video scaler
    cudaupload: CUDA uploader
    nvh264dec: NVDEC h264 Video Decoder
    nvh264sldec: NVDEC H.264 Stateless Decoder
    nvh265dec: NVDEC h265 Video Decoder
    nvh265sldec: NVDEC H.265 Stateless Decoder
    nvjpegdec: NVDEC jpeg Video Decoder
    nvmpeg2videodec: NVDEC mpeg2video Video Decoder
    nvmpeg4videodec: NVDEC mpeg4video Video Decoder
    nvmpegvideodec: NVDEC mpegvideo Video Decoder
    nvvp9dec: NVDEC vp9 Video Decoder
    nvvp9sldec: NVDEC VP9 Stateless Decoder
  • Testing hardware-decoded playback of an H.264 file

~$ sudo apt-get install gstreamer1.0-plugins-base-apps

~$ gst-discoverer-1.0 test.x264.AAC5.1.mp4
Done discovering test.x264.AAC5.1.mp4
Missing plugins
(gstreamer|1.0|gst-discoverer-1.0|GStreamer element vaapipostproc|element-vaapipostproc)

Properties:
Duration: 1:39:44.736000000
Seekable: yes
Live: no
container #0: Quicktime
video #1: H.264 (High Profile)
Stream ID: 1a5271d9ce1c168fa86e3c3727d54d189e469c400ed56a5836c778c4ddd01ac6/001
Width: 1920
Height: 1036
Depth: 24
Frame rate: 24000/1001
Pixel aspect ratio: 1/1
Interlaced: false
Bitrate: 2249690
Max bitrate: 31250000
audio #2: MPEG-4 AAC
Stream ID: 1a5271d9ce1c168fa86e3c3727d54d189e469c400ed56a5836c778c4ddd01ac6/002
Language: <unknown>
Channels: 6 (front-left, front-right, front-center, lfe1, rear-left, rear-right)
Sample rate: 48000
Depth: 32
Bitrate: 384000
Max bitrate: 384000

  • Decoding with the NVDEC H.264 Stateless Decoder:
export LIBVA_DRIVER_NAME=nvidia
export GST_VAAPI_ALL_DRIVERS=1
export GST_VAAPI_DRM_DEVICE=/dev/dri/renderD128

~$ gst-launch-1.0 filesrc location=test.x264.AAC5.1.mp4 ! parsebin ! nvh264sldec ! videoconvert ! xvimagesink

~$ nvidia-smi
Sun May 7 22:40:07 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:07:00.0 On | N/A |
| N/A 49C P0 N/A / 30W | 726MiB / 2048MiB | 18% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 11453 G /usr/lib/xorg/Xorg 294MiB |
| 0 N/A N/A 12305 G ...e/michael/firefox/firefox 268MiB |
| 0 N/A N/A 93886 G ...AAAAAAAAA= --shared-files 1MiB |
| 0 N/A N/A 95739 G ...RendererForSitePerProcess 74MiB |
| 0 N/A N/A 119708 C gst-launch-1.0 82MiB |
+-----------------------------------------------------------------------------+

  • With the libav H.264 software decoder, CPU usage is roughly 10% higher than above.
~$ gst-launch-1.0 filesrc location=test4.mp4 ! parsebin ! avdec_h264 ! videoconvert ! xvimagesink
  • Use glimagesink to test the speed; fpsdisplaysink shows the current frame rate.
~$ sudo apt-get install gstreamer1.0-gl
~$ GST_VAAPI_DRM_DEVICE=/dev/dri/renderD128 gst-launch-1.0 filesrc location=test4.mp4 ! parsebin ! nvh264sldec ! videoconvert ! fpsdisplaysink video-sink=glimagesink sync=false

Error log analysis

  • The error below occurred because my machine already had /usr/local/include/va, from an older version that lacks these structs, while /usr/include/va (installed by libva-dev) contains the correct ones. meson finds /usr/local/include/va first, so #include <va/va.h> resolves there and /usr/include/va/va.h is shadowed.
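A shadowed header like this can be found mechanically by walking the include path in search order (find_shadowed is a hypothetical diagnostic helper; the directory list mirrors the one the compiler used above):

```python
import os

def find_shadowed(header, search_dirs):
    """Return every copy of `header` found along the include search path,
    in order; more than one hit means the first copy shadows the rest."""
    return [os.path.join(d, header)
            for d in search_dirs
            if os.path.isfile(os.path.join(d, header))]

# e.g. find_shadowed('va/va.h', ['/usr/local/include', '/usr/include'])
```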
~$ cc -Invidia_drv_video.so.p -I. -I.. -I../nvidia-include -I/usr/local/include -I/usr/include -I/usr/include/libdrm -I/usr/include/gstreamer-1.0 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/x86_64-linux-gnu -fvisibility=hidden -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=gnu11 -g -Wno-missing-field-initializers -Wno-unused-parameter -Werror=format -Werror=init-self -Werror=int-conversion -Werror=missing-declarations -Werror=missing-prototypes -Werror=pointer-arith -Werror=undef -Werror=vla -Wsuggest-attribute=format -Wwrite-strings -fPIC -pthread -MD -MQ nvidia_drv_video.so.p/src_h264.c.o -MF nvidia_drv_video.so.p/src_h264.c.o.d -o nvidia_drv_video.so.p/src_h264.c.o -c ../src/h264.c
In file included from ../src/h264.c:1:
../src/vabackend.h:123:77: error: unknown type name ‘VADRMPRIMESurfaceDescriptor’
123 | bool (*fillExportDescriptor)(struct _NVDriver *drv, NVSurface *surface, VADRMPRIMESurfaceDescriptor *desc);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/h264.c:133:1: warning: ‘retain’ attribute directive ignored [-Wattributes]
133 | const DECLARE_CODEC(h264Codec) = {
| ^~~~~

  • The following error is likewise caused by a stale /usr/local/include/EGL on the local system.
FAILED: nvidia_drv_video.so.p/src_export-buf.c.o
cc -Invidia_drv_video.so.p -I. -I.. -I../nvidia-include -I/usr/local/include -I/usr/include/libdrm -I/usr/include/gstreamer-1.0 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/x86_64-linux-gnu -fvisibility=hidden -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -g -Wno-missing-field-initializers -Wno-unused-parameter -Werror=format -Werror=incompatible-pointer-types -Werror=init-self -Werror=int-conversion -Werror=missing-declarations -Werror=missing-prototypes -Werror=pointer-arith -Werror=undef -Werror=vla -Wsuggest-attribute=format -Wwrite-strings -fPIC -pthread -MD -MQ nvidia_drv_video.so.p/src_export-buf.c.o -MF nvidia_drv_video.so.p/src_export-buf.c.o.d -o nvidia_drv_video.so.p/src_export-buf.c.o -c ../src/export-buf.c
../src/export-buf.c: In function ‘egl_initExporter’:
../src/export-buf.c:242:5: error: unknown type name ‘PFNEGLQUERYDMABUFFORMATSEXTPROC’; did you mean ‘PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC’?
242 | PFNEGLQUERYDMABUFFORMATSEXTPROC eglQueryDmaBufFormatsEXT = (PFNEGLQUERYDMABUFFORMATSEXTPROC) eglGetProcAddress("eglQueryDmaBufFormatsEXT");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC
../src/export-buf.c:242:65: error: ‘PFNEGLQUERYDMABUFFORMATSEXTPROC’ undeclared (first use in this function); did you mean ‘PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC’?
242 | PFNEGLQUERYDMABUFFORMATSEXTPROC eglQueryDmaBufFormatsEXT = (PFNEGLQUERYDMABUFFORMATSEXTPROC) eglGetProcAddress("eglQueryDmaBufFormatsEXT");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC
../src/export-buf.c:242:65: note: each undeclared identifier is reported only once for each function it appears in
../src/export-buf.c:242:98: error: expected ‘,’ or ‘;’ before ‘eglGetProcAddress’
242 | PFNEGLQUERYDMABUFFORMATSEXTPROC eglQueryDmaBufFormatsEXT = (PFNEGLQUERYDMABUFFORMATSEXTPROC) eglGetProcAddress("eglQueryDmaBufFormatsEXT");
| ^~~~~~~~~~~~~~~~~
../src/export-buf.c:265:9: error: called object ‘eglQueryDmaBufFormatsEXT’ is not a function or function pointer
265 | if (eglQueryDmaBufFormatsEXT(drv->eglDisplay, 64, formats, &formatCount)) {

  • Nvidia-drm errors
~$ dmesg
[...]
[38810.269044] [drm:drm_new_set_master [drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to grab modeset ownership
[41522.270711] [drm:drm_new_set_master [drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to grab modeset ownership
[42735.271307] [drm:drm_new_set_master [drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to grab modeset ownership
[44347.269266] [drm:drm_new_set_master [drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to grab modeset ownership

  • The errors above are caused by options nvidia-drm modeset=1. Make sure none of the files below, nor any file under /etc/modprobe.d or /usr/lib/modprobe.d, contains this setting.
~$ grep --include=*.conf -rnw /usr/lib/  -e "nvidia-drm"
/usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf

~$ sudo grep --include=*.conf -rnw /etc/ -e "nvidia-drm"
/etc/nvidia/current/nvidia-modprobe.conf:2:options nvidia-drm modset=1
/etc/nvidia/current/nvidia-modprobe.conf:5:install nvidia-drm modprobe nvidia-modeset ; modprobe -i nvidia-current-drm $CMDLINE_OPTS
/etc/nvidia/current/nvidia-modprobe.conf:13:remove nvidia modprobe -r -i nvidia-drm nvidia-modeset nvidia-peermem nvidia-uvm nvidia
/etc/nvidia/current/nvidia-modprobe.conf:15:remove nvidia-modeset modprobe -r -i nvidia-drm nvidia-modeset
/etc/nvidia/current/nvidia-load.conf:1:nvidia-drm
/etc/nvidia/current/nvidia-drm-outputclass.conf:3:# nvidia-drm.ko kernel module. Please note that this only works on Linux kernels
/etc/nvidia/current/nvidia-drm-outputclass.conf:4:# version 3.9 or higher with CONFIG_DRM enabled, and only if the nvidia-drm.ko
/etc/nvidia/current/nvidia-drm-outputclass.conf:9: MatchDriver "nvidia-drm"
  • polkitd segfault
  • interpreting-segfault-messages: https://stackoverflow.com/questions/2549214/interpreting-segfault-messages/2549363#2549363

~$ dmesg
polkitd[99838]: segfault at 8 ip 0000564f56f95736 sp 00007ffe8b5fa800 error 4 in polkitd[564f56f91000+e000] likely on CPU 1 (core 1, socket 0)
  • This error may come from a library-compatibility problem, or from a reinstall failing to overwrite old files. Run the following steps:
    • sudo apt-get remove -y policykit-1;
    • dpkg --purge policykit-1;
    • manually delete /etc/policykit-1 and double-check that the commands above removed everything;
    • sudo apt-get install policykit-1 -y;


  • Atmel's AVR series is a family of 8- to 32-bit microcontrollers based on a modified Harvard architecture with a reduced instruction set (RISC), developed by Atmel in 1996. AVR was among the first single-chip microcontrollers to use flash memory as its program storage, while contemporary microcontrollers mostly used one-time-programmable ROM, EPROM, or EEPROM. The AVR line has grown into six series: tinyAVR (ATtiny); megaAVR (ATmega); XMEGA (ATxmega); application-specific AVR, adding features such as LCD controllers, USB controllers, and PWM; FPSLIC, an AVR core on an FPGA; and AVR32, a 32-bit series with SIMD, DSP, and audio/video processing features that competed with the ARM architecture.

ATmega32U4(Arduino pro micro)

  • sparkfun/Arduino_Boards

  • atmega-asm

  • MiniCore

  • Arduino-Based (ATmega32U4) Mouse and Keyboard Controller

  • The Lost Art of Structure Packing

  • Wire up ICSP to flash the bootloader.

            (2232HIO)
    FT232H ATmega32U4

    pin13 ADBUS0 <------> SCK pin15
    pin14 ADBUS1 <------> MOSI pin16
    pin15 ADBUS2 <------> MISO pin14
    pin16 ADBUS3 <------> Reset RST
    GND <------> GND
    +3.3V <------> +3.3V

    ~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB0 -b 19200 -p atmega32u4 -U lfuse:r:-:i

    avrdude: AVR device initialized and ready to accept instructions

    Reading | ################################################## | 100% 0.01s

    avrdude: Device signature = 0x1e9587 (probably m32u4)
    avrdude: reading lfuse memory:

    Reading | ################################################## | 100% 0.00s

    avrdude: writing output file "<stdout>"
    :01000000FF00
    :00000001FF

    avrdude: safemode: Fuses OK (E:CB, H:D8, L:FF)

    avrdude done. Thank you.
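The -U lfuse:r:-:i output above is Intel HEX: :01000000FF00 is a one-byte data record holding the lfuse value 0xFF, and :00000001FF is the end-of-file record. A small decoder sketch (parse_ihex_record is a hypothetical helper):

```python
def parse_ihex_record(rec):
    """Decode one Intel HEX record, e.g. ':01000000FF00' -> (type, addr, data).
    The last byte is a checksum: the two's complement of the sum of all
    preceding bytes, so the total of the whole record is 0 modulo 256."""
    assert rec.startswith(':'), 'records start with a colon'
    b = bytes.fromhex(rec[1:])
    count, addr, rtype = b[0], (b[1] << 8) | b[2], b[3]
    assert sum(b) & 0xFF == 0, 'checksum mismatch'
    return rtype, addr, b[4:4 + count]

rtype, addr, data = parse_ihex_record(':01000000FF00')
# rtype 0 (data record), addr 0, data b'\xff' -- the lfuse byte read above
```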
  • Some valid programmer types for FTDI adapters:

2232HIO          = FT2232H based generic programmer
4232h = FT4232H based generic programmer
ft232r = FT232R Synchronous BitBang
ft245r = FT245R Synchronous BitBang
ttl232r = FTDI TTL232R-5V with ICSP adapter
UM232H = FT232H based module from FTDI and Glyn.com.au
  • Add sparkfun/Arduino_Boards so the Arduino IDE supports more board types. After adding the URL, install SparkFun AVR Boards via Tools -> Boards Manager, then flash the SparkFun bootloader.
~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB0 -b 19200 -p m32u4  -U flash:w:.arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1e9587 (probably m32u4)
avrdude: NOTE: "flash" memory has been specified, an erase cycle will be performed
To disable this feature, specify the -D option.
avrdude: erasing chip
avrdude: reading input file ".arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex"
avrdude: input file .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex auto detected as Intel Hex
avrdude: writing flash (32762 bytes):

Writing | ################################################## | 100% 0.00s

avrdude: 32762 bytes of flash written
avrdude: verifying flash memory against .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex:
avrdude: load data flash data from input file .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex:
avrdude: input file .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex auto detected as Intel Hex
avrdude: input file .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex contains 32762 bytes
avrdude: reading on-chip flash data:

Reading | ################################################## | 100% 0.00s

avrdude: verifying ...
avrdude: 32762 bytes of flash verified

avrdude: safemode: Fuses OK (E:CB, H:D8, L:FF)

avrdude done. Thank you.

ATmega328p (Arduino Pro mini with CH340)

  • Once the bootloader is flashed, you can develop over USB in the Arduino IDE.
  • Select Tools -> Board -> Arduino Pro or Pro Mini, with the AVRISP mkII programmer.

Key points of UART communication

  • Unlike other communication protocols, UART has no clock signal to reference, so both sides must know each other's baud rate in advance to know how fast data is being sent. The requirements are:

    • The two devices must share a common ground (GND).
    • The baud rates must be identical.
  • At a baud rate of 9600, each bit should last 1/9600 of a second. How many CPU cycles is that? If the CPU ran at 9600 Hz, it would be exactly one cycle per bit.
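That relationship is a plain division; for a realistic AVR clock it gives the cycle budget a software UART has per bit (cycles_per_bit is a hypothetical helper):

```python
def cycles_per_bit(cpu_hz, baud):
    """Number of CPU cycles that fit into one UART bit time (1/baud s)."""
    return cpu_hz / baud

# At 9600 baud, a 9600 Hz CPU gets exactly 1 cycle per bit,
# while a 16 MHz AVR gets about 1667 cycles per bit.
print(cycles_per_bit(9600, 9600))
print(round(cycles_per_bit(16_000_000, 9600)))
```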

ATtiny85(CJMCU)

Bootloader

  • ATtiny85 USB Boot Loader: Details

  • micronucleus is a cross-platform bootloader that supports flashing over USB and fits in under 2 KB.


    FT232H ATTiny85

    pin13 ADBUS0 <------> SCK PB2
    pin14 ADBUS1 <------> MOSI PB0
    pin15 ADBUS2 <------> MISO PB1
    pin16 ADBUS3 <------> Reset PB5
    GND <------> GND
    +5V <------> +5V

    ~ micronucleus/firmware/releases$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB1 -b 19200 -p attiny85 -U flash:w:t85_default.hex -U lfuse:w:0xe2:m -U hfuse:w:0xdd:m -U efuse:w:0xfe:m
  • After plugging it into USB, the following device appears:

~$ lsusb
[...]
Bus 004 Device 007: ID 16d0:0753 MCS Digistump DigiSpark
[...]
  • Copy micronucleus/commandline/49-micronucleus.rules into the system's /etc/udev/rules.d/ directory.

System control and reset

  • This involves the fuse-bit (fuse) mechanism. For the details, see the datasheet, section 20. Memory Programming, which describes how to program each bit of the fuse bytes. You can also use https://www.engbedded.com/fusecalc/ to work out the three fuse bytes: lfuse is the low byte, hfuse the high byte, and efuse the extended byte.
  • A few pitfalls when programming fuses. For example, setting the SPIEN or JTAGEN bits to the unprogrammed state disables the chip's JTAG and SPI interfaces, so it can no longer be reflashed and is effectively locked; recovering such a chip requires high-voltage (12 V) parallel programming. Another pitfall is the boot start address: if the bootloader feature is not enabled, do not set BOOTRST to 0 (programmed), or on power-up the MCU will jump to the boot section instead of starting from flash address 0x0000 and will not run correctly.
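As a sanity check alongside the fuse calculator, a fuse byte can be decoded bit by bit. The sketch below uses the ATtiny85 low-fuse bit names (taken from the datasheet's fuse table; note the AVR convention that a 0 bit means "programmed"):

```python
# ATtiny85 low fuse, bit 7 down to bit 0 (per the datasheet's fuse table).
LFUSE_BITS = ['CKDIV8', 'CKOUT', 'SUT1', 'SUT0',
              'CKSEL3', 'CKSEL2', 'CKSEL1', 'CKSEL0']

def programmed(fuse_byte, names=LFUSE_BITS):
    """List the fuse bits that are programmed (i.e. 0) in `fuse_byte`."""
    return [name for i, name in enumerate(names)
            if not fuse_byte & (1 << (7 - i))]

# lfuse 0xe2 (as written in the micronucleus flashing command above)
# leaves CKDIV8 unprogrammed and selects the internal RC oscillator.
print(programmed(0xE2))
```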

Arduino IDE support (ATTinyCore)

  • ATTinyCore adds ATtiny-series support to recent Arduino IDEs. Install it per https://github.com/SpenceKonde/ATTinyCore/blob/master/Installation.md: in Arduino IDE -> File -> Preferences add http://drazzy.com/package_drazzy.com_index.json, then update and install the ATTinyCore package. Afterwards the target MCU can be selected under Arduino IDE -> Tools -> Board -> ATTinyCore.
  • The micronucleus bundled with ATTinyCore may be too old for the version flashed on the ATtiny85, producing the error below. The bundled version lives in ~/.arduino15/packages/ATTinyCore/tools/micronucleus/2.0a4/; the hardware and firmware configurations ATTinyCore supports are under ~/.arduino15/packages/ATTinyCore/hardware/avr/1.4.1/bootloaders.
Warning: device with unknown new version of Micronucleus detected.
This tool doesn't know how to upload to this new device. Updates may be available.
Device reports version as: 2.4
  • To clear the warning above, update the micronucleus command-line tool:
    ~$ cd micronucleus/commandline && make
    ~$ cp micronucleus ~/.arduino15/packages/ATTinyCore/tools/micronucleus/2.0a4/
  • Board: Tools -> Board -> ATTinyCore -> ATtiny85 (Micronucleus / DigiSpark)

  • Flashing method: Tools -> Burn Bootloader Method: "Upgrade (via USB)"

  • With the right board selected, the remaining defaults are fine. Test setup:
    arduino-attinycore-t85.png

  • The test program is File -> Examples -> Built-in examples -> 01.Basics -> Blink, with only the LED_BUILTIN pin redefined, as follows:

    #define LED_BUILTIN 1

    // the setup function runs once when you press reset or power the board
    void setup() {
      // initialize digital pin LED_BUILTIN as an output.
      pinMode(LED_BUILTIN, OUTPUT);
    }

    // the loop function runs over and over again forever
    void loop() {
      digitalWrite(LED_BUILTIN, HIGH); // turn the LED on (HIGH is the voltage level)
      delay(1000);                     // wait for a second
      digitalWrite(LED_BUILTIN, LOW);  // turn the LED off by making the voltage LOW
      delay(1000);                     // wait for a second
    }
  • The flashing output looks like this:

Plug in device now... (will timeout in 60 seconds)
> Please plug in the device ...
> Press CTRL+C to terminate the program.
> Device is found!
connecting: 16% complete
connecting: 22% complete
connecting: 28% complete
connecting: 33% complete
> Device has firmware version 2.4
> Device signature: 0x1e930b
> Available space for user applications: 6522 bytes
> Suggested sleep time between sending pages: 7ms
> Whole page count: 102 page size: 64
> Erase function sleep duration: 714ms
parsing: 50% complete
> Erasing the memory ...
erasing: 55% complete
erasing: 60% complete
erasing: 65% complete
> Starting to upload ...
writing: 70% complete
writing: 75% complete
writing: 80% complete
> Starting the user app ...
running: 100% complete
>> Micronucleus done. Thank you!

Adding a new programmer (UM232H as an example)

  • Platform specification

    ~$ tree -L 2 ~/.arduino15/packages/
    /home/michael/.arduino15/packages/
    ├── arduino
    │   ├── hardware
    │   └── tools
    ├── atmel-avr-xminis
    │   └── hardware
    ├── ATTinyCore
    │   ├── hardware
    │   └── tools
    ├── esp32
    │   ├── hardware
    │   └── tools
    ├── SparkFun
    │   └── hardware
    └── STM32
    ├── hardware
    └── tools
  • Every package under ~/.arduino15/packages/ follows the structure above: each package (e.g. arduino, ATTinyCore) has boards.txt and programmers.txt files under hardware/<arch>/<version>/.

  • A package's tools/ directory holds the toolchain the package depends on: compilers, uploaders, and other specific tools. Taking avrdude as an example, when the Arduino IDE flashes an AVR board it invokes the package's own avrdude together with its configuration file, e.g.:

.arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/bin/avrdude \
-c .arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/etc/avrdude.conf
  • The following makes the ATtiny85 from the ATTinyCore package flashable from the Arduino IDE through a UM232H programmer, with no bootloader required.
  • First, add the following to ~/.arduino15/packages/ATTinyCore/hardware/avr/1.4.1/programmers.txt:
um232h.name=UM232H as ISP
um232h.communication=serial
um232h.protocol=UM232H
um232h.speed=19200
um232h.program.protocol=UM232H
um232h.program.speed=19200
um232h.program.tool=avrdude
um232h.program.extra_params=-P{serial.port} -b{program.speed}
  • After restarting the Arduino IDE, selecting an ATTinyCore board shows UM232H as ISP in the Programmer menu. If a package does not define its own programmers.txt, the IDE falls back to the one for the target board's architecture at ~/.arduino15/packages/arduino/hardware/<arch>/<version>/programmers.txt

  • For example:

    ~$ .arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/bin/avrdude -C .arduino15/packages/ATTinyCore/hardware/avr/1.4.1/avrdude.conf
  • If .arduino15/packages/ATTinyCore/hardware/avr/1.4.1/avrdude.conf has no UM232H entry, add the following to it:

    # UM232H module from FTDI and Glyn.com.au.
    # See helix.air.net.au for detailed usage information.
    # J1: Connect pin 2 and 3 for USB power.
    # J2: Connect pin 2 and 3 for USB power.
    # J2: Pin 7 is SCK
    # :   Pin 8 is MOSI
    # :   Pin 9 is MISO
    # :   Pin 11 is RST
    # :   Pin 6 is ground
    # Use the -b flag to set the SPI clock rate eg -b 3750000 is the fastest I could get
    # a 16MHz Atmega1280 to program reliably. The 232H is conveniently 5V tolerant.
    programmer
      id         = "UM232H";
      desc       = "FT232H based module from FTDI and Glyn.com.au";
      type       = "avrftdi";
      usbvid     = 0x0403;
      # Note: This PID is reserved for generic 232H devices and
      # should be programmed into the EEPROM
      usbpid     = 0x6014;
      usbdev     = "A";
      usbvendor  = "";
      usbproduct = "";
      usbsn      = "";
      # ISP signals
      sck        = 0;
      mosi       = 1;
      miso       = 2;
      reset      = 3;
    ;

  • If the error below appears when flashing, the avrdude binary in use was built without libftdi/libusb support, which is one reason the UM232H entry fails on some setups:

    avrdude: Error: no libftdi or libusb support. Install libftdi1/libusb-1.0 or libftdi/libusb and run configure/make again.
  • The simplest fix is to replace .arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/bin/avrdude with the system's /usr/bin/avrdude.

  • The Arduino IDE calls the in-package tool (avrdude) because its path is defined as follows:

    ~$ grep "avrdude.path" ~/.arduino15/packages/arduino/hardware/avr/1.8.3/platform.txt
    tools.avrdude.path={runtime.tools.avrdude.path}

    ~$ grep "avrdude.path" ~/.arduino15/packages/ATTinyCore/hardware/avr/1.4.1/platform.txt
    tools.avrdude.path={runtime.tools.avrdude.path}
  • Finally, adding other kinds of programmers, such as the FT2232HL, works the same way. This approach combines the software libraries of the Arduino IDE ecosystem with fast hardware development and testing; flashing can also be driven from a Makefile using avrdude directly.

Adding an FT2232HL

  • Opening /etc/avrdude.conf shows that it already defines entries for the FT2232H and FT4232H:
~$ cat /etc/avrdude.conf
[...]
programmer
  id         = "2232HIO";
  desc       = "FT2232H based generic programmer";
  type       = "avrftdi";
  connection_type = usb;
  usbvid     = 0x0403;
  # Note: This PID is reserved for generic H devices and
  # should be programmed into the EEPROM
  # usbpid   = 0x8A48;
  usbpid     = 0x6010;
  usbdev     = "A";
  usbvendor  = "";
  usbproduct = "";
  usbsn      = "";
  # ISP signals
  reset      = 3;
  sck        = 0;
  mosi       = 1;
  miso       = 2;
  buff       = ~4;
  # LED signals
  errled     = ~ 11;
  rdyled     = ~ 14;
  pgmled     = ~ 13;
  vfyled     = ~ 12;
;

# The FT4232H can be treated as FT2232H, but it has a different USB
# device ID of 0x6011.
programmer parent "avrftdi"
  id     = "4232h";
  desc   = "FT4232H based generic programmer";
  usbpid = 0x6011;
;
[...]
  • As shown above, the system-level avrdude already supports the FT2232H, so only a matching entry in the hardware package's programmers.txt is needed. Note, however, that a hardware package usually ships its own avrdude.conf, which is consulted first. Again taking the ATTinyCore hardware package as an example:
~$ tail -n 10 ~/.arduino15/packages/ATTinyCore/hardware/avr/1.5.2/programmers.txt


ft2232h.name=2232HIO as ISP
ft2232h.communication=serial
ft2232h.protocol=avrftdi
ft2232h.speed=19200
ft2232h.program.protocol=avrftdi
ft2232h.program.speed=19200
ft2232h.program.tool=avrdude
ft2232h.program.extra_params=-P{serial.port} -b{program.speed}

  • Wiring the FT2232H to the ATtiny85:
 FT2232H              ATTiny85

ADBUS0 <------> SCK PB2
ADBUS1 <------> MOSI PB0
ADBUS2 <------> MISO PB1
ADBUS3 <------> Reset PB5
GND <------> GND
+5V <------> +5V (VIN)

Flashing with avrdude

  • The preceding sections flashed a bootloader into the ATtiny85's flash with SELFPRGEN (Self-Programming Enable) and SPIEN (Enable Serial Program and Data Downloading) enabled. The advantage is that programs can then be uploaded over USB (D-: PB3/AD3, D+: PB4/AD2) and the chip integrates simply with the Arduino IDE, with no external programmer needed. The drawback is that the bootloader consumes 2 KB of storage, while the ATtiny85 has only 8 KB of flash.

  • Below, the UM232H is used to program the chip directly, the same way the bootloader itself was flashed, so the full 8 KB is available. A simple blink example:

    ~ blink$ cat main.c
    // main.c
    //
    // A simple blinky program for ATtiny85
    // Connect red LED at pin 2 (PB1)
    //
    // electronut.in

    #include <avr/io.h>
    #include <util/delay.h>

    int main (void)
    {
      // set PB1 to be output
      DDRB = 0b00000010;
      while (1) {
        // flash# 1:
        // set PB1 high
        PORTB = 0b00000010;
        _delay_ms(20);
        // set PB1 low
        PORTB = 0b00000000;
        _delay_ms(20);

        // flash# 2:
        // set PB1 high
        PORTB = 0b00000010;
        _delay_ms(200);
        // set PB1 low
        PORTB = 0b00000000;
        _delay_ms(200);
      }

      return 1;
    }
  • Makefile

    # Makefile for programming the ATtiny85
    # modified from the one generated by CrossPack

    DEVICE     = attiny85
    CLOCK      = 8000000
    PROGRAMMER = -c UM232H
    OBJECTS    = main.o
    # for ATTiny85
    # see http://www.engbedded.com/fusecalc/
    FUSES      = -U lfuse:w:0x62:m -U hfuse:w:0xdf:m -U efuse:w:0xff:m

    # Tune the lines below only if you know what you are doing:
    AVRDUDE = avrdude $(PROGRAMMER) -p $(DEVICE)
    COMPILE = avr-gcc -Wall -Os -DF_CPU=$(CLOCK) -mmcu=$(DEVICE)

    # symbolic targets:
    all: main.hex

    .c.o:
    	$(COMPILE) -c $< -o $@

    .S.o:
    	$(COMPILE) -x assembler-with-cpp -c $< -o $@

    .c.s:
    	$(COMPILE) -S $< -o $@

    flash: all
    	$(AVRDUDE) -U flash:w:main.hex:i

    fuse:
    	$(AVRDUDE) $(FUSES)

    # Xcode uses the Makefile targets "", "clean" and "install"
    install: flash fuse

    # if you use a bootloader, change the command below appropriately:
    load: all
    	bootloadHID main.hex

    clean:
    	rm -f main.hex main.elf $(OBJECTS)

    # file targets:
    main.elf: $(OBJECTS)
    	$(COMPILE) -o main.elf $(OBJECTS)

    main.hex: main.elf
    	rm -f main.hex
    	avr-objcopy -j .text -j .data -O ihex main.elf main.hex
    	avr-size --format=avr --mcu=$(DEVICE) main.elf
    # If you have an EEPROM section, you must also create a hex file for the
    # EEPROM and add it to the "flash" target.

    # Targets for code debugging and analysis:
    disasm: main.elf
    	avr-objdump -d main.elf

    cpp:
    	$(COMPILE) -E main.c
  • Wire it up as described above. Since no bootloader is used here, running make flash in the blink directory writes the program to flash via avrdude.

  • The same upload can also be done with a single command:

    ~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB1 -b 19200 -p attiny85 -U flash:w:main.hex:i

Recovering locked fuses with high-voltage programming


  • Recovering the fuses is somewhat involved. Following the post High Voltage programming/Unbricking for Attiny, you need an Arduino board (or at least a microcontroller with six free I/O pins), six 1k resistors, an NPN transistor, and a 12V supply, as shown:

  • high-voltage-programmer.png

  • Reading the fuses:

    ~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB1 -b 19200 -p attiny85  -U lfuse:r:-:i -v

    avrdude: Version 6.3-20171130
    Copyright (c) 2000-2005 Brian Dean, http://www.bdmicro.com/
    Copyright (c) 2007-2014 Joerg Wunsch

    System wide configuration file is "/etc/avrdude.conf"
    User configuration file is "/home/michael/.avrduderc"
    User configuration file does not exist or is not a regular file, skipping

    Using Port : /dev/ttyUSB1
    Using Programmer : UM232H
    Overriding Baud Rate : 19200
    AVR Part : ATtiny85
    Chip Erase delay : 4500 us
    PAGEL : P00
    BS2 : P00
    RESET disposition : possible i/o
    RETRY pulse : SCK
    serial program mode : yes
    parallel program mode : yes
    Timeout : 200
    StabDelay : 100
    CmdexeDelay : 25
    SyncLoops : 32
    ByteDelay : 0
    PollIndex : 3
    PollValue : 0x53
    Memory Detail :

    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    eeprom 65 6 4 0 no 512 4 0 4000 4500 0xff 0xff
    flash 65 6 32 0 yes 8192 64 128 4500 4500 0xff 0xff
    signature 0 0 0 0 no 3 0 0 0 0 0x00 0x00
    lock 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    lfuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    hfuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    efuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    calibration 0 0 0 0 no 1 0 0 0 0 0x00 0x00

    Programmer Type : avrftdi
    Description : FT232H based module from FTDI and Glyn.com.au

    avrdude: AVR device initialized and ready to accept instructions

    Reading | ################################################## | 100% 0.01s

    avrdude: Device signature = 0x1e930b (probably t85)
    avrdude: safemode: lfuse reads as 62
    avrdude: safemode: hfuse reads as DF
    avrdude: safemode: efuse reads as FE
    avrdude: reading lfuse memory:

    Reading | ################################################## | 100% 0.00s

    avrdude: writing output file "<stdout>"
    :01000000629D
    :00000001FF

    avrdude: safemode: lfuse reads as 62
    avrdude: safemode: hfuse reads as DF
    avrdude: safemode: efuse reads as FE
    avrdude: safemode: Fuses OK (E:FE, H:DF, L:62)

    avrdude done. Thank you.

  • An out-of-memory case: defining a 1024-element array overflows the ATtiny85's 512 bytes of SRAM.

    avr-size --format=avr --mcu=attiny85 main.elf
    AVR Memory Usage
    ----------------
    Device: attiny85

    Program: 2454 bytes (30.0% Full)
    (.text + .data + .bootloader)

    Data: 1043 bytes (203.7% Full)
    (.data + .bss + .noinit)

ATtiny85/ATmega328P timer/counter notes

8-bit timer (timer0)

  • First confirm whether the ATtiny85 at hand is actually running at 8 MHz, which can be determined by reading its fuses. The settings used here are -U lfuse:w:0xe2:m -U hfuse:w:0xdf:m -U efuse:w:0xff:m; the fuse configuration used for testing is shown below:

attiny85_fuse.png

  • The ATtiny85's default clock frequency is 8 MHz, meaning 8,000,000 cycles per second; one clock period is 1/8,000,000 s = 0.000000125 s, i.e. 125 ns. A 16-bit timer (0-65535), incremented once per clock cycle, overflows from 0 past 65535 in only 8.192 ms, and an 8-bit timer overflows in just 0.032 ms. For longer intervals the clock must go through the prescaler. According to section 14.9.2 TCCR0B - Timer/Counter Control Register B of the datasheet, bits 2:0 of TCCR0B select the following prescaler settings:
CS02 CS01 CS00  Description
0    0    0     No clock source (Timer/Counter stopped)
0    0    1     clkI/O (no prescaling)
0    1    0     clkI/O / 8 (from prescaler)
0    1    1     clkI/O / 64 (from prescaler)
1    0    0     clkI/O / 256 (from prescaler)
1    0    1     clkI/O / 1024 (from prescaler)
1    1    0     External clock source on T0 pin. Clock on falling edge.
1    1    1     External clock source on T0 pin. Clock on rising edge.
  • The program below uses the 8-bit timer for a one-second delay, toggling the pin state every second. At the default 8 MHz, each cycle is 1/8 MHz = 0.125 us = 125 ns; after a /1024 prescale the timer ticks at 7812.5 Hz, i.e. once every 1/7812.5 = 0.000128 s, so the 8-bit timer can only count up to 0.000128 s * 255 ≈ 0.0326 s.

  • Timer/Counter0 is put in CTC mode with the output-compare-match interrupt enabled, comparing against the value held in OCR0A. With OCR0A = 250, a compare interrupt fires whenever TCNT0 reaches 250 (250 < 255). After 32 interrupts, roughly one second, the state of PB1 is toggled.

    ~$ cat timer0.c
    #include <avr/io.h>
    #include <avr/interrupt.h>

    int intr_count = 0;

    void setupTimer0() {
      cli();
      // Clear registers
      TCCR0A = 0;
      TCCR0B = 0;

      // 7812.5 Hz (8000000/((0+1)*1024))
      OCR0A = 250; // 0.000128s * 250 = 32ms
      // CTC: clear timer on compare match
      TCCR0A |= (1 << WGM01);
      // Prescaler 1024
      TCCR0B |= (1 << CS02) | (1 << CS00);
      // Output Compare Match A Interrupt Enable
      TIMSK |= (1 << OCIE0A);
      sei(); // enable global interrupts, i.e. SREG |= 0x80
    }

    ISR(TIMER0_COMPA_vect) {
      if (intr_count == 31) {
        intr_count = 0;
        PORTB ^= (1 << PB1); // toggle the LED
      } else {
        intr_count++;
      }
    }

    int main()
    {
      DDRB = 0b00000010; // enable PB1
      setupTimer0();
      while (1) {}
    }
  • The ATtiny85's timer1 is also an 8-bit timer, but it supports prescaling up to 14 bits (/16384, MAX = 16384). In the test below, the compare register is set to 248 and the pin is toggled every second interrupt; logic-analyzer sampling shows a 1.006 s square wave. 1/488.28125 = 0.002048 s, which also equals 0.000000125 s * 16384 = 0.002048 s.

~$ cat timer1.c
#include <avr/io.h>
#include <avr/interrupt.h>

int intr_count = 0;

void setupTimer1() {
  cli();
  // Clear registers
  TCNT1 = 0;
  TCCR1 = 0;

  // 488.28125 Hz (8000000/((0+1)*16384))
  OCR1C = 248;
  // interrupt on COMPA
  OCR1A = OCR1C;
  // CTC
  TCCR1 |= (1 << CTC1);
  // Prescaler 16384
  TCCR1 |= (1 << CS13) | (1 << CS12) | (1 << CS11) | (1 << CS10);
  // Output Compare Match A Interrupt Enable
  TIMSK |= (1 << OCIE1A);
  sei();
}

ISR(TIMER1_COMPA_vect) {
  if (intr_count == 1) {
    intr_count = 0;
    PORTB ^= (1 << PB1); // toggle the LED
  } else {
    intr_count++;
  }
}

int main()
{
  DDRB = 0b00000010; // enable PB1
  setupTimer1();
  while (1) {}
}

16-bit timer (timer1)

  • The program below uses the ATmega328P's 16-bit timer1 for a one-second delay. The ATmega328P defaults to 16 MHz, so each cycle is 1/16 MHz = 0.0625 us; after a /1024 prescale the timer ticks at 15625 Hz. The setup mirrors the ATtiny85 case above, except that a 16-bit timer (0-65535) is used. With the compare register set to 15640, logic-analyzer sampling shows a 1 s square wave, while 15625 gives 0.9993 s.
~$ cat timer1.ino
// AVR Timer CTC Interrupts Calculator
// v. 8
// http://www.arduinoslovakia.eu/application/timer-calculator
// Microcontroller: ATmega328P
// Created: 2022-04-27T14:02:45.452Z

#define ledPin 13

void setupTimer1() {
  noInterrupts();
  // Clear registers
  TCCR1A = 0;
  TCCR1B = 0;
  TCNT1 = 0;

  // 15625 Hz (16000000/((0+1)*1024))
  OCR1A = 15640;
  // CTC
  TCCR1B |= (1 << WGM12);
  // Prescaler 1024
  TCCR1B |= (1 << CS12) | (1 << CS10);
  // Output Compare Match A Interrupt Enable
  TIMSK1 |= (1 << OCIE1A);
  interrupts();
}

void setup() {
  pinMode(ledPin, OUTPUT);
  setupTimer1();
}

void loop() {
}

ISR(TIMER1_COMPA_vect) {
  digitalWrite(ledPin, digitalRead(ledPin) ^ 1);
}

ATmega8-16PU


Adding Arduino IDE support

Wiring the ATmega8-16PU to the UM232H

 UM232H             ATmega8-16PU        Arduino pinout
 AD0 (CK) <-------> Pin19 PB5 (SCK)     digital pin 13
 AD1 (DO) <-------> Pin17 PB3 (MOSI)    digital pin 11
 AD2 (DI) <-------> Pin18 PB4 (MISO)    digital pin 12
 AD3 (CS) <-------> Pin1  PC6 (RESET)
 GND      <-------> Pin8  GND
 +5v      <-------> Pin7  VCC

Reading and writing fuses

  • Reading its fuse settings:

    ~$ avrdude -c UM232H -P /dev/ttyUSB1 -b 19200 -p m8  -U lfuse:r:-:i -v

    avrdude: Version 6.3-20171130
    Copyright (c) 2000-2005 Brian Dean, http://www.bdmicro.com/
    Copyright (c) 2007-2014 Joerg Wunsch

    System wide configuration file is "/etc/avrdude.conf"
    User configuration file is "/home/michael/.avrduderc"
    User configuration file does not exist or is not a regular file, skipping

    Using Port : /dev/ttyUSB1
    Using Programmer : UM232H
    Overriding Baud Rate : 19200
    AVR Part : ATmega8
    Chip Erase delay : 10000 us
    PAGEL : PD7
    BS2 : PC2
    RESET disposition : dedicated
    RETRY pulse : SCK
    serial program mode : yes
    parallel program mode : yes
    Timeout : 200
    StabDelay : 100
    CmdexeDelay : 25
    SyncLoops : 32
    ByteDelay : 0
    PollIndex : 3
    PollValue : 0x53
    Memory Detail :

    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    eeprom 4 20 128 0 no 512 4 0 9000 9000 0xff 0xff
    flash 33 10 64 0 yes 8192 64 128 4500 4500 0xff 0x00
    lfuse 0 0 0 0 no 1 0 0 2000 2000 0x00 0x00
    hfuse 0 0 0 0 no 1 0 0 2000 2000 0x00 0x00
    lock 0 0 0 0 no 1 0 0 2000 2000 0x00 0x00
    calibration 0 0 0 0 no 4 0 0 0 0 0x00 0x00
    signature 0 0 0 0 no 3 0 0 0 0 0x00 0x00

    Programmer Type : avrftdi
    Description : FT232H based module from FTDI and Glyn.com.au

    avrdude: AVR device initialized and ready to accept instructions

    Reading | ################################################## | 100% 0.01s

    avrdude: Device signature = 0x1e9307 (probably m8)
    avrdude: safemode: lfuse reads as 62
    avrdude: safemode: hfuse reads as DF
    avrdude: reading lfuse memory:

    Reading | ################################################## | 100% 0.00s

    avrdude: writing output file "<stdout>"
    :01000000629D
    :00000001FF

    avrdude: safemode: lfuse reads as 62
    avrdude: safemode: hfuse reads as DF
    avrdude: safemode: Fuses OK (E:FF, H:DF, L:62)

    avrdude done. Thank you.
  • Writing the fuse configuration, calculated here with the AVR® Fuse Calculator:

atmega8_16pu_fuse.png

~$avrdude -c UM232H -P /dev/ttyUSB1 -b 19200 -p m8   -U lfuse:w:0xd4:m -U hfuse:w:0xc9:m

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1e9307 (probably m8)
avrdude: reading input file "0xd4"
avrdude: writing lfuse (1 bytes):

Writing | ################################################## | 100% 0.01s

avrdude: 1 bytes of lfuse written
avrdude: verifying lfuse memory against 0xd4:
avrdude: load data lfuse data from input file 0xd4:
avrdude: input file 0xd4 contains 1 bytes
avrdude: reading on-chip lfuse data:

Reading | ################################################## | 100% 0.00s

avrdude: verifying ...
avrdude: 1 bytes of lfuse verified
avrdude: reading input file "0xc9"
avrdude: writing hfuse (1 bytes):

Writing | ################################################## | 100% 0.00s

avrdude: 1 bytes of hfuse written
avrdude: verifying hfuse memory against 0xc9:
avrdude: load data hfuse data from input file 0xc9:
avrdude: input file 0xc9 contains 1 bytes
avrdude: reading on-chip hfuse data:

Reading | ################################################## | 100% 0.00s

avrdude: verifying ...
avrdude: 1 bytes of hfuse verified

avrdude: safemode: Fuses OK (E:FF, H:C9, L:D4)

avrdude done. Thank you.

Development and testing with Rust

Prerequisites

~$ sudo apt install binutils-avr avr-libc gcc-avr pkg-config avrdude libudev-dev

Install Micronucleus (Optional)

  • FT2232H wiring to the LilyTiny ATtiny85:
 FT2232H              ATTiny85

ADBUS0 <------> SCK PB2
ADBUS1 <------> MOSI PB0
ADBUS2 <------> MISO PB1
ADBUS3 <------> Reset PB5
GND <------> GND
+5V <------> VIN
  • Build and flash the firmware:
~$ git clone https://github.com/micronucleus/micronucleus
~$ cd micronucleus
~$ make PROGRAMMER="-c UM232H -P /dev/ttyUSB0 -b 19200" flash
  • Build the micronucleus command-line flashing tool:
~$ sudo apt-get install libusb-dev
~$ cd micronucleus/commandline
~$ make && cp micronucleus /usr/bin
  • Connect the board over USB and inspect it:
~$ lsusb -v -s 001:030

Bus 001 Device 030: ID 16d0:0753 MCS Digistump DigiSpark
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 255 Vendor Specific Class
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 8
idVendor 0x16d0 MCS
idProduct 0x0753 Digistump DigiSpark
bcdDevice 2.06
iManufacturer 0
iProduct 0
iSerial 0
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 0x0012
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0x80
(Bus Powered)
MaxPower 100mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 0
bInterfaceClass 0
bInterfaceSubClass 0
bInterfaceProtocol 0
iInterface 0
Device Status: 0xffff
Self Powered
Remote Wakeup Enabled
Test Mode
Debug Mode

Install Rust env

~$ cargo +stable install ravedude
~$ git clone https://github.com/Rahix/avr-hal
~$ cd avr-hal/example/trinket
~$ cargo build
~$ avr-objcopy --output-target=ihex ../../target/avr-attiny85/debug/trinket-simple-pwm.elf ../../target/avr-attiny85/debug/trinket-simple-pwm.hex
~$ micronucleus --timeout 60 --run --no-ansi ../../target/avr-attiny85/debug/trinket-simple-pwm.hex

Build release: minimizing Rust binary size

  • Add the following lines to Cargo.toml:
[profile.release]
strip = true
opt-level = "z" # Optimize for size.
lto = true
panic = "abort"
debug = false

  • Build the release:
~$ RUSTFLAGS="-Zlocation-detail=none" cargo build --release
warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
package: /fullpath//github/AVR/avr-hal/examples/trinket/Cargo.toml
workspace: /fullpath//github/AVR/avr-hal/Cargo.toml
Compiling compiler_builtins v0.1.98
Compiling core v0.0.0 (/home/michael/.rustup/toolchains/nightly-2023-08-08-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core)
Compiling proc-macro2 v1.0.69
Compiling unicode-ident v1.0.12
Compiling syn v1.0.109
Compiling proc-macro-hack v0.5.20+deprecated
Compiling rustversion v1.0.14
Compiling paste v1.0.14
Compiling quote v1.0.33
Compiling avr-hal-generic v0.1.0 (/fullpath//github/AVR/avr-hal/avr-hal-generic)
Compiling avr-device-macros v0.5.2
Compiling ufmt-macros v0.1.1 (https://github.com/Rahix/ufmt.git?rev=12225dc1678e42fecb0e8635bf80f501e24817d9#12225dc1)
Compiling rustc-std-workspace-core v1.99.0 (/home/michael/.rustup/toolchains/nightly-2023-08-08-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/rustc-std-workspace-core)
Compiling nb v1.1.0
Compiling ufmt-write v0.1.0 (https://github.com/Rahix/ufmt.git?rev=12225dc1678e42fecb0e8635bf80f501e24817d9#12225dc1)
Compiling bare-metal v1.0.0
Compiling vcell v0.1.3
Compiling cfg-if v1.0.0
Compiling void v1.0.2
Compiling cfg-if v0.1.10
Compiling embedded-storage v0.2.0
Compiling panic-halt v0.2.0
Compiling ufmt v0.1.0 (https://github.com/Rahix/ufmt.git?rev=12225dc1678e42fecb0e8635bf80f501e24817d9#12225dc1)
Compiling avr-device v0.5.2
Compiling nb v0.1.3
Compiling embedded-hal v0.2.7
Compiling attiny-hal v0.1.0 (/fullpath//github/AVR/avr-hal/mcu/attiny-hal)
Compiling arduino-hal v0.1.0 (/fullpath//github/AVR/avr-hal/arduino-hal)
Compiling trinket-examples v0.0.0 (/fullpath//github/AVR/avr-hal/examples/trinket)
WARN rustc_codegen_ssa::back::link Linker does not support -no-pie command line option. Retrying without.
WARN rustc_codegen_ssa::back::link Linker does not support -no-pie command line option. Retrying without.
Finished release [optimized + debuginfo] target(s) in 6.60s

Driving the NRF24L01

  • NRF24L01 register dump read back from the module:

CONFIG:    0b
EN_AA: 3e
EN_RXADDR: 01
SETUP_AW: 02
SETUP_RETR:30
RF_CH: 02
RF_SETUP: 05
STATUS: 0e
OBS_TX: 00
TX_ADDR: 71917d6b
CD: 00
RX_PW_P0: 20
RX_PW_P1: 00
RX_PW_P2: 00
RX_PW_P3: 00
RX_PW_P4: 00
RX_PW_P5: 00
FIFO_STAT: 11
DYNPD: 00
FEATURE: 05

Driving the SSD1306

Debugging

GDB

Logic analyzer issues

Character handling

AVR-GCC assembly notes

GCC asm Statement

  • Let’s start with a simple example of reading a value from port D:

    asm("in %0, %1" : "=r" (value) : "I" (_SFR_IO_ADDR(PORTD)) );

    Each asm statement is divided by colons into (up to) four parts:

    1. The assembler instructions, defined as a single string constant: "in %0, %1"
    2. A list of output operands, separated by commas. Our example uses just one: "=r" (value)
    3. A comma separated list of input operands. Again our example uses one operand only:
      "I" (_SFR_IO_ADDR(PORTD))
    4. Clobbered registers, left empty in our example.
  • You can write assembler instructions in much the same way as you would write assembler programs. However, registers and constants are used in a different way if they refer to expressions of your C program. The connection between registers and C operands is specified in the second and third part of the asm instruction, the list of input and output operands, respectively. The general form is

asm(code : output operand list : input operand list [: clobber list]);
  • In the code section, operands are referenced by a percent sign followed by a single digit. %0 refers to the first operand, %1 to the second, and so forth. From the above example:
%0 refers to "=r" (value) and
%1 refers to "I" (_SFR_IO_ADDR(PORTD)).

Input and Output Operands

  • Each input and output operand is described by a constraint string followed by a C expression in parentheses. AVR-GCC 3.3 knows the following constraint characters:
  • Note
    • The most up-to-date and detailed information on constraints for the AVR can be found in the gcc manual.
    • The x register is r27:r26, the y register is r29:r28, and the z register is r31:r30
Constraint Used for Range
a Simple upper registers r16 to r23
b Base pointer registers pairs y, z
d Upper register r16 to r31
e Pointer register pairs x, y, z
q Stack pointer register SPH:SPL
r Any register r0 to r31
t Temporary register r0
w Special upper register pairs r24, r26, r28, r30
x Pointer register pair X x (r27:r26)
y Pointer register pair Y y (r29:r28)
z Pointer register pair Z z (r31:r30)
G Floating point constant 0.0
I 6-bit positive integer constant 0 to 63
J 6-bit negative integer constant -63 to 0
K Integer constant 2
L Integer constant 0
l Lower registers r0 to r15
M 8-bit integer constant 0 to 255
N Integer constant -1
O Integer constant 8, 16, 24
P Integer constant 1
Q (GCC >= 4.2.x) A memory address based on Y or Z pointer with displacement.
R (GCC >= 4.3.x) Integer constant. -6 to 5
Mnemonic Constraints Mnemonic Constraints
adc r,r add r,r
adiw w,I and r,r
andi d,M asr r
bclr I bld r,I
brbc I,label brbs I,label
bset I bst r,I
cbi I,I cbr d,I
com r cp r,r
cpc r,r cpi d,M
cpse r,r dec r
elpm t,z eor r,r
in r,I inc r
ld r,e ldd r,b
ldi d,M lds r,label
lpm t,z lsl r
lsr r mov r,r
movw r,r mul r,r
neg r or r,r
ori d,M out I,r
pop r push r
rol r ror r
sbc r,r sbci d,M
sbi I,I sbic I,I
sbiw w,I sbr d,M
sbrc r,I sbrs r,I
ser d st e,r
std b,r sts label,r
sub r,r subi d,M
swap r
  • Constraint characters may be prepended by a single constraint modifier. Constraints without a modifier specify read-only operands. Modifiers are:
  • Modifier Specifies
= 	Write-only operand, usually used for all output operands.
+ Read-write operand
& Register should be used for output only

Note: in assembler programming, the term clobbered registers is used to denote any registers whose value may be overwritten during the course of executing an instruction or procedure.


Docker Machine

Overview

  • Docker Machine is one of Docker's official orchestration projects, responsible for quickly installing a Docker environment on a variety of platforms. It is implemented in Go and currently maintained on GitHub. With Docker Machine you can install and configure docker hosts in bulk; a host can be a local virtual machine, a physical machine, or a cloud instance. Supported environments include:
    • (1) Regular Linux operating systems.
    • (2) Virtualization platforms: VirtualBox, VMWare, Hyper-V, OpenStack.
    • (3) Public clouds: Amazon Web Services, Microsoft Azure, Google Compute Engine, Digital Ocean, etc.
  • Docker Machine gives these environments a unified name: provider. For a given provider, Docker Machine uses the corresponding driver to install and configure the docker host.

Installing via script

# Install docker-machine
~$ base=https://github.com/docker/machine/releases/download/v0.16.0 &&
curl -L $base/docker-machine-$(uname -s)-$(uname -m) >/tmp/docker-machine &&
sudo install /tmp/docker-machine /usr/local/bin/docker-machine
# Check the version
~$ docker-machine version
docker-machine version 0.16.0, build 702c267f
# List machines
~$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS

Creating a machine

  • The command below creates a VirtualBox VM named default on the local machine:
    $ docker-machine  create -d virtualbox default
    Running pre-create checks...
    Creating machine...
    (default) Copying /home/lcy/.docker/machine/cache/boot2docker.iso to /home/lcy/.docker/machine/machines/default/boot2docker.iso...
    (default) Creating VirtualBox VM...
    (default) Creating SSH key...
    (default) Starting the VM...
    (default) Check network to re-create if needed...
    (default) Waiting for an IP...
    Waiting for machine to be running, this may take a few minutes...
    Detecting operating system of created instance...
    [...]
    Docker is up and running!
    To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env default
    ~$ docker-machine create -d virtualbox manager1 &&
    > docker-machine create -d virtualbox manager2 &&
    > docker-machine create -d virtualbox worker1 &&
    > docker-machine create -d virtualbox worker2
    ~$ docker-machine ls
    NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
    default - virtualbox Running tcp://192.168.99.100:2376 v18.09.1
    manager1 - virtualbox Running tcp://192.168.99.101:2376 v18.09.1
    manager2 - virtualbox Running tcp://192.168.99.102:2376 v18.09.1
    worker1 - virtualbox Running tcp://192.168.99.103:2376 v18.09.1
    worker2 - virtualbox Running tcp://192.168.99.104:2376 v18.09.1

Docker-side operations

# Upgrade docker on worker1 and worker2 to the latest version
~$ docker-machine upgrade worker1 worker2
  • To support management via Ansible, install some Python-related packages (see the reference link). tce's install directory is docker@manager1:/usr/local/tce.installed; for example, first remove an old Python install with docker-machine ssh manager1 "rm -rf /usr/local/tce.installed/python", then reinstall with the command below. The bash for loop used here runs serially; for parallel execution, see parallel.

Installing the Python environment packages

~$  for item in manager1 manager2 worker1 worker2; do docker-machine ssh $item "tce-load -wi python && curl https://bootstrap.pypa.io/get-pip.py | sudo python - && sudo ln -s /usr/local/bin/python /usr/bin/python";done

Ansible management connection

  • Create the inventory file. Because each host's private key lives in a different location, specify them explicitly in hosts.txt as follows.

    ~$ cat hosts.txt
    [swarm]
    192.168.99.100 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/default/id_rsa ansible_python_interpreter=/usr/local/bin/python
    192.168.99.101 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/manager1/id_rsa ansible_python_interpreter=/usr/local/bin/python
    192.168.99.102 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/manager2/id_rsa ansible_python_interpreter=/usr/local/bin/python
    192.168.99.103 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/worker1/id_rsa ansible_python_interpreter=/usr/local/bin/python
    192.168.99.104 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/worker2/id_rsa ansible_python_interpreter=/usr/local/bin/python
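Since the key paths follow the machine names, the inventory lines above can also be generated from `docker-machine ls` output. A sketch with awk, where the here-doc sample stands in for the real command output (field positions assumed to match the layout shown earlier):

```shell
# Sketch: derive Ansible inventory lines from `docker-machine ls`-style output.
# The here-doc sample stands in for real `docker-machine ls` output.
inventory=$(awk 'NR>1 {
    ip=$5; sub(/^tcp:\/\//, "", ip); sub(/:[0-9]+$/, "", ip);
    printf "%s ansible_ssh_private_key_file=%s/.docker/machine/machines/%s/id_rsa ansible_python_interpreter=/usr/local/bin/python\n", ip, ENVIRON["HOME"], $1
}' <<'EOF'
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
manager1 - virtualbox Running tcp://192.168.99.101:2376 v18.09.1
worker1 - virtualbox Running tcp://192.168.99.103:2376 v18.09.1
EOF
)
echo "$inventory"
```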
  • Connection test.

    ~$ ansible -i hosts.txt all -u docker -m ping
    192.168.99.102 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    192.168.99.100 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    192.168.99.103 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    192.168.99.104 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    192.168.99.101 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }

Configuring the Docker Swarm cluster nodes

Configure the Swarm manager node

~$ docker-machine ssh manager1 "docker swarm init --advertise-addr 192.168.99.101"
Swarm initialized: current node (slf4m19dsk0cvo6wcpxjwm10v) is now a manager.

To add a worker to this swarm, run the following command:

docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-dsf3agxhdq6prycves2cxg16w 192.168.99.101:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

Configure the worker nodes

  • Join worker1 and worker2 to the cluster as worker nodes.
    ~$ docker-machine ssh worker1 "docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-dsf3agxhdq6prycves2cxg16w 192.168.99.101:2377"
    This node joined a swarm as a worker.

    ~$ docker-machine ssh worker2 "docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-dsf3agxhdq6prycves2cxg16w 192.168.99.101:2377"
    This node joined a swarm as a worker.

Add a standby manager node

# Show the token for joining the swarm as a manager
~$ docker-machine ssh manager1 "docker swarm join-token manager"
To add a manager to this swarm, run the following command:

docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-1pg2xuizss0074mw3xgf59b5t 192.168.99.101:2377
# Join manager2 as a manager node, acting as a standby candidate manager
~$ docker-machine ssh manager2 "docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-1pg2xuizss0074mw3xgf59b5t 192.168.99.101:2377"
This node joined a swarm as a manager.
# List the swarm cluster nodes.
~$ docker-machine ssh manager1 "docker node ls"
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
slf4m19dsk0cvo6wcpxjwm10v * manager1 Ready Active Leader 18.09.1
5navxdx9kkvph4elfb2dcuiee manager2 Ready Active Reachable 18.09.1
wsvs8cb6sbroj0wuz09z9vrdj worker1 Ready Active 18.09.1
qkwz8akr2ef8oe5kddibmglfs worker2 Ready Active 18.09.1

Create a private Docker registry

  • Create a private image registry on the local machine with the following command.

    ~$ docker run -d -v /data/docker-registry:/var/lib/registry -p 5000:5000 --restart=always --name registry registry
    Unable to find image 'registry:latest' locally
    latest: Pulling from library/registry
    cd784148e348: Pull complete
    [...]
    Digest: sha256:a54bc9be148764891c44676ce8c44f1e53514c43b1bfbab87b896f4b9f0b5d99
    Status: Downloaded newer image for registry:latest
    242af2d15586d2d571c46c5edf821ce958cf22139d957e52a6f5d959726957bf
  • Next, push an existing local image to the newly created registry.

    ~$ docker ps
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    242af2d15586 registry "/entrypoint.sh /etc…" About a minute ago Up About a minute 0.0.0.0:5000->5000/tcp registry
    8ee3d7fb435f redis "docker-entrypoint.s…" 4 months ago Up 3 hours 0.0.0.0:6379->6379/tcp redis
    608b60a022e8 postgres:9.6 "docker-entrypoint.s…" 4 months ago Up 3 hours 0.0.0.0:5432->5432/tcp pg96

    # Re-tag the local postgres:9.6 image as 192.168.99.1:5000/postgres:v3
    ~$ docker tag postgres:9.6 192.168.99.1:5000/postgres:v3
    ~$ docker images
    REPOSITORY TAG IMAGE ID CREATED SIZE
    registry latest 116995fd6624 5 days ago 25.8MB
    127.0.0.1:5000/postgres 9.6 0178d5af9576 5 months ago 229MB
    192.168.99.1:5000/postgres 9.6 0178d5af9576 5 months ago 229MB
    postgres 9.6 0178d5af9576 5 months ago 229MB
    # Push the image.
    ~$ docker push 192.168.99.1:5000/postgres:v3
    The push refers to repository [192.168.99.1:5000/postgres]
    10cb36af78fe: Pushed
    [...]
    v3: digest: sha256:86a7984760c1d36c7c9ebec73706f05d76e7615937a45ae0d110b2112fd5cbfa size: 3245
    ~$ curl http://192.168.99.1:5000/v2/_catalog
    {"repositories":["postgres"]}
    ~$ curl http://192.168.99.1:5000/v2/postgres/tags/list
    {"name":"postgres","tags":["v3"]}
  • Simply pull an image from the public registry and push it into the private one.

    ~$ docker pull dockersamples/visualizer
    ~$ docker tag dockersamples/visualizer 192.168.99.1:5000/visualizer:v4
    ~$ docker push 192.168.99.1:5000/visualizer:v4
    ~$ curl http://192.168.99.1:5000/v2/_catalog
    {"repositories":["postgres","visualizer"]}

- Use ansible to add the private registry address created above to the Docker daemon configuration on all four hosts.

~$ ansible -i hosts.txt all -u docker -b  -m lineinfile -a "path=/etc/docker/daemon.json line='{\n\t\t\"insecure-registries\":    [\"192.168.99.1:5000\"]\n}' create=yes"
# Restart all nodes
~$ docker-machine restart manager1 manager2 worker2 worker1
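Note that lineinfile with an embedded multi-line string easily produces malformed JSON. A safer pattern is to write the whole file and validate it before shipping; a sketch (using /tmp so it is safe to run anywhere; on the nodes the real target is /etc/docker/daemon.json, e.g. via Ansible's copy module):

```shell
# Sketch: write a complete daemon.json rather than splicing lines into it.
# /tmp is used here so the sketch is harmless; the real target on each
# node is /etc/docker/daemon.json.
conf=/tmp/daemon.json
cat > "$conf" <<'EOF'
{
    "insecure-registries": ["192.168.99.1:5000"]
}
EOF
# Validate the JSON before distributing it to the nodes.
python3 -m json.tool "$conf" > /dev/null && echo "daemon.json is valid JSON"
```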

Deploying a single cluster service with Docker Service

~$ docker-machine ssh manager1 "docker service create --replicas 4 -p 15432:5432 --name pgsql 192.168.99.1:5000/postgres:v3"
yt4u3vmyq3gp9pc2bgowti1lf
overall progress: 0 out of 4 tasks
[....]
verify: Service converged

~$ docker-machine ssh manager1 "docker service ls"
ID NAME MODE REPLICAS IMAGE PORTS
yt4u3vmyq3gp pgsql replicated 4/4 192.168.99.1:5000/postgres:v3 *:15432->5432/tcp
~$ docker-machine ssh manager1 "docker service ps pgsql"
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
zj09wnhy4utz pgsql.1 192.168.99.1:5000/postgres:v3 worker1 Running Running 27 seconds ago
l7feljripf38 pgsql.2 192.168.99.1:5000/postgres:v3 manager2 Running Running 27 seconds ago
omxdx1cw8c7x pgsql.3 192.168.99.1:5000/postgres:v3 worker2 Running Running 27 seconds ago
tkghl3px2l08 pgsql.4 192.168.99.1:5000/postgres:v3 manager1 Running Running 26 seconds ago

# Log in to test
~$ psql -h 192.168.99.101 -p 15432 -U postgres -W
Password for user postgres:
psql (10.6 (Debian 10.6-1.pgdg90+1), server 9.6.11)
Type "help" for help.

Deploying multiple cluster services with Docker Stack

  • Make sure the following images exist in the local registry. Verify the images:

    ~$ docker images | awk '$2 ~/v4/ {print}'
    192.168.99.1:5000/nginx v4 42b4762643dc 34 hours ago 109MB
    192.168.99.1:5000/visualizer v4 f6411ebd974c 3 weeks ago 166MB
    192.168.99.1:5000/portainer v4 a01958db7424 6 weeks ago 72.2MB
  • docker-compose.yml

    version: '3'

    services:
      nginx:
        image: 192.168.99.1:5000/nginx:v4
        ports:
          - 8088:80
        deploy:
          mode: replicated
          replicas: 4

      visualizer:
        image: 192.168.99.1:5000/visualizer:v4
        ports:
          - '8080:8080'
        volumes:
          - '/var/run/docker.sock:/var/run/docker.sock'
        deploy:
          replicas: 1
          placement:
            constraints: [node.role == manager]

      portainer:
        image: 192.168.99.1:5000/portainer:v4
        ports:
          - '9000:9000'
        volumes:
          - '/var/run/docker.sock:/var/run/docker.sock'
        deploy:
          replicas: 1
          placement:
            constraints: [node.role == manager]

Create the services

~$ docker-machine ssh manager1 "docker stack deploy -c docker-compose.yml deploy-demo"
Creating network deploy-demo_default
Creating service deploy-demo_visualizer
Creating service deploy-demo_portainer
Creating service deploy-demo_nginx

Managing Aliyun ECS

The official Aliyun driver

  • Aliyun Docker Machine driver
  • Getting started with the Aliyun ECS Docker Machine Driver
  • By analogy with creating machines in the local VirtualBox above, the official Aliyun driver requires the Aliyun security key pair and the corresponding Region. Creating a VPC network additionally requires the
    VPC ID and VSwitch ID. This is somewhat cumbersome, and the --aliyunecs-access-key-id and --aliyunecs-access-key-secret parameters carry significant privileges, so I will look into it further only when a concrete use case at work calls for it.

The generic driver

  • Although there is no Aliyun driver below, there is a generic driver that manages existing machines over SSH; in principle every Linux machine is supported.
    ~$ docker-machine create --driver generic --generic-ip-address DB001 --generic-ssh-user lcy  --generic-ssh-key $HOME/.ssh/id_rsa   aliyun-machine
    Running pre-create checks...
    Creating machine...
    (aliyun-machine) Importing SSH key...
    Waiting for machine to be running, this may take a few minutes...
    Detecting operating system of created instance...
    Waiting for SSH to be available...
    Detecting the provisioner...
    Provisioning with debian...
    Copying certs to the local machine directory...
    Copying certs to the remote machine...
    Setting Docker configuration on the remote daemon...
    Checking connection to Docker...
    Docker is up and running!
    To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env aliyun-machine
    ~$ docker-machine env aliyun-machine
    export DOCKER_TLS_VERIFY="1"
    export DOCKER_HOST="tcp://DB001:2376"
    export DOCKER_CERT_PATH="/home/lcy/.docker/machine/machines/aliyun-machine"
    export DOCKER_MACHINE_NAME="aliyun-machine"
    # Run this command to configure your shell:
    # eval $(docker-machine env aliyun-machine)

Errors

  • Add the line { "insecure-registries": ["192.168.99.1:5000"] } to /etc/docker/daemon.json, otherwise the following error occurs.

    ~$ docker pull 192.168.99.1:5000/postgres:9.6
    Error response from daemon: Get https://192.168.99.1:5000/v2/: http: server gave HTTP response to HTTPS client
  • Error when Ansible connects to a boot2docker image that has no Python environment

    192.168.99.102 | FAILED! => {
    "changed": false,
    "module_stderr": "Shared connection to 192.168.99.102 closed.\r\n",
    "module_stdout": "/bin/sh: /usr/local/bin/python: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
    }

Ansible

Installation

~$ pip install ansible ansible-lint

Command format

  • ansible <host-pattern> [options]: the hosts or host groups defined in the Inventory; this can be an IP, a hostname, an Inventory group name, or a pattern string with special characters such as ".", "*", ":". <> marks a required argument, [] an optional one.
  • For example, upgrading the system as root with the apt module. The YAML-based configuration files are covered later.
    ~$ ansible -i ~/.ansible/hosts all -u root -m apt -a "upgrade=yes update_cache=yes cache_valid_time=86400"
  • The example below uses the authorized_key module to add a public key for a user; exclusive=True would replace the existing keys. Here the new key is appended to the existing ones.
    ~$ ansible -i ~/.ansible/hosts all -u lcy -m authorized_key -a "user=lcy state=present  key='ssh-rsa AAAAB3NzaC1yc...... user@gentoo'"
# Append entries to the hosts file on each machine
    ~$ ansible -i ~/.ansible/hosts all -u lcy -b -m lineinfile -a "path=/etc/hosts create=yes line='127.0.0.1\tlocalhost\n172.18.127.186\tDB001\n172.18.192.77\tWeb001\n172.18.253.222\tFE001\n172.18.192.76\tDIG001'"

View system information

# Call the command module here to run a system command.
~$ ansible -i ~/.ansible/hosts Web001 -u root -m command -a 'lsb_release -a'
120.77.xxx.xx | CHANGED | rc=0 >>
Distributor ID: Debian
Description: Debian GNU/Linux 9.6 (stretch)
Release: 9.6
Codename: stretchNo LSB modules are available.

Command-line tools

ansible-doc

  • Use ansible-doc to browse documentation for all supported modules. Since I am on Debian, the package-management modules relevant to it are listed here.
    ~$ ansible-doc  -l | grep "apt"
    apt Manages apt-packages
    apt_key Add or remove an apt key
    apt_repository Add and remove APT repositories
    apt_rpm apt_rpm package manager
    na_ontap_ucadapter NetApp ONTAP UC adapter configuration
    nios_naptr_record Configure Infoblox NIOS NAPTR records

    # View the module's parameters and usage examples; the web docs are at [apt_module](https://docs.ansible.com/ansible/latest/modules/apt_module.html)
    ~$ ansible-doc apt
    > APT (/home/lcy/.pyenv/versions/3.6.6/envs/py3dev/lib/python3.6/site-packages/ansible/modules/packaging/os/apt.py)

    Manages \`apt\' packages (such as for Debian/Ubuntu).

    OPTIONS (= is mandatory):

    - allow_unauthenticated
    Ignore if packages cannot be authenticated. This is useful for bootstrapping environments that manage their own apt-key setup.
    \`allow_unauthenticated\' is only supported with state: \`install\'/\`present\'
    [Default: no]
    type: bool
    version_added: 2.1
    [....]

ansible-playbook

  • Ansible task configuration files are called Playbooks. A Playbook is easy to write, highly customizable, and flexible, and can codify all routine operations. A complete real-world example is shown in the Docker installation below.

Configuration

  • The default configuration file is named ansible.cfg and can live in several places, typically /etc/ansible/ansible.cfg or ~/.ansible/ansible.cfg under the home directory. Ad-hoc commands are one-off executions; an ansible-playbook run is a collection of ad-hoc steps, effectively a script.
  • Below is a typical hosts file; the default lookup location is /etc/ansible/hosts, but here it lives under the home directory.
    ~$ cat ~/.ansible/hosts
    [Web001]
    120.77.xxx.xx

    [DB001]
    119.23.xx.xxx

    [DIG001]
    120.78.xx.xxx

    [FE001]
    112.74.xxx.xx

    ~$ ansible -i ~/.ansible/hosts all -m ping -u root
    120.77.xxx.xx | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    119.23.xx.xxx | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    112.74.xx.xxx | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    120.78.xxx.xx | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }

ansible-vault

  • ansible-vault is mainly used to encrypt configuration files, e.g. when a Playbook contains sensitive information. See here.
# After encryption, a.yaml is unreadable ciphertext.
~$ ansible-vault encrypt a.yaml
New Vault password:
Confirm New Vault password:
Encryption successful
# Decrypt a.yaml
~$ ansible-vault decrypt a.yaml
Vault password:
Decryption successful

ansible-galaxy

  • Ansible Galaxy 文档

  • This has nothing to do with Samsung phones; think of it as something like GitHub or pip. It is mainly used to generate, search for, and install Roles. Some high-quality Roles can be found here.

    ~$  ansible-galaxy --help
    Usage: ansible-galaxy [delete|import|info|init|install|list|login|remove|search|setup] [--help] [options] ...

    Perform various Role related operations.

    Options:
    -h, --help show this help message and exit
    -c, --ignore-certs Ignore SSL certificate validation errors.
    -s API_SERVER, --server=API_SERVER
    The API server destination
    -v, --verbose verbose mode (-vvv for more, -vvvv to enable
    connection debugging)
    --version show program's version number and exit

    See 'ansible-galaxy <command> --help' for more information on a specific
    command.
  • Install the postgresql Role and inspect its directory structure; use it as a reference when creating custom Roles.

    ~$ ansible-galaxy install geerlingguy.postgresql
    - downloading role 'postgresql', owned by geerlingguy
    - downloading role from https://github.com/geerlingguy/ansible-role-postgresql/archive/1.4.5.tar.gz
    - extracting geerlingguy.postgresql to /home/lcy/.ansible/roles/geerlingguy.postgresql
    - geerlingguy.postgresql (1.4.5) was installed successfully
    ~$ tree /home/lcy/.ansible/roles/geerlingguy.postgresql/
    /home/lcy/.ansible/roles/geerlingguy.postgresql/
    ├── defaults
    │ └── main.yml
    ├── handlers
    │ └── main.yml
    ├── LICENSE
    ├── meta
    │ └── main.yml
    ├── molecule
    │ └── default
    │ ├── molecule.yml
    │ ├── playbook.yml
    │ ├── tests
    │ │ └── test_default.py
    │ └── yaml-lint.yml
    ├── README.md
    ├── tasks
    │ ├── configure.yml
    │ ├── databases.yml
    │ ├── initialize.yml
    │ ├── main.yml
    │ ├── setup-Debian.yml
    │ ├── setup-RedHat.yml
    │ ├── users.yml
    │ └── variables.yml
    ├── templates
    │ ├── pg_hba.conf.j2
    │ └── postgres.sh.j2
    └── vars
    ├── Debian-7.yml
    ├── Debian-8.yml
    ├── Debian-9.yml
    ├── RedHat-6.yml
    ├── RedHat-7.yml
    ├── Ubuntu-14.yml
    ├── Ubuntu-16.yml
    └── Ubuntu-18.yml

    9 directories, 27 files

Installing Docker via Ansible

  • Serve Static Files by Nginx from Django using Docker
    - apt - Manages apt-packages - Get Docker CE for Debian - Following Get Docker CE for Debian, the installation procedure is converted into a Playbook below. With this installation method, Docker on each target host is independent; to orchestrate them together, use Docker Machine.

    ---
    - name: Install base software
      hosts: all
      become: yes
      # user: root  # root works here too, but once root SSH login is disabled you must use sudo.
      tasks:
        # See https://docs.ansible.com/ansible/latest/modules/apt_module.html
        - name: Update and install
          apt:
            name: ['apt-transport-https', 'ca-certificates', 'curl', 'software-properties-common', 'tmux']
            allow_unauthenticated: yes
            update_cache: yes

        # Another way to install a list of packages
        - name: Install a list of packages
          apt:
            name: '{{ packages }}'
            update_cache: yes
          vars:
            packages:
              - git
              - rsync
              - gcc
              - dirmngr
              - bwm-ng
              - tmux
              - tree

        - name: Update and install gnupg2
          apt:
            name: gnupg2
            allow_unauthenticated: no
            update_cache: yes

        # See https://docs.ansible.com/ansible/latest/modules/apt_key_module.html?highlight=apt%20key
        - name: Add the Docker GPG key
          apt_key:
            url: https://download.docker.com/linux/debian/gpg
            state: present

        # See https://docs.ansible.com/ansible/latest/modules/command_module.html#command-module
        - name: Read the distribution codename
          command: lsb_release -sc
          register: result

        # See https://docs.ansible.com/ansible/latest/modules/apt_repository_module.html?highlight=add%20apt%20repository
        - apt_repository:
            repo: deb [arch=amd64] https://download.docker.com/linux/debian {{ result.stdout }} stable
            state: present
            filename: docker-ce

        # Install docker-ce and bridge-utils
        - name: Update and install 'docker-ce', 'bridge-utils'
          apt:
            name: ['docker-ce', 'bridge-utils']
            allow_unauthenticated: yes
            update_cache: yes

        - name: Read the kernel name
          command: uname -s
          register: vendor

        - name: Read the machine architecture
          command: uname -m
          register: arch

        # Install the latest docker-compose; the version packaged in apt is very old.
        - name: Install the latest docker-compose 1.23.2
          get_url:
            url: https://github.com/docker/compose/releases/download/1.23.2/docker-compose-{{ vendor.stdout }}-{{ arch.stdout }}
            dest: /usr/local/bin/docker-compose
            mode: 0755

        - name: Reboot the machine
          reboot:
            reboot_timeout: 3600
  • After the reboot, you should be able to inspect Docker with the command below.

~$ ansible -i ~/.ansible/hosts all -u lcy  -m command -a "docker info"
  • Note: if bridge-utils is not installed and the machine not rebooted, Docker cannot start, and an error like the following may appear.
    Dec 29 10:26:16 DB001 dockerd[20493]: time="2018-12-29T10:26:16.760508487+08:00" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
    Dec 29 10:26:16 DB001 dockerd[20493]: Error starting daemon: Error initializing network controller: Error creating default "bridge" network: package not installed
    Dec 29 10:26:16 DB001 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
    Dec 29 10:26:16 DB001 systemd[1]: Failed to start Docker Application Container Engine.

Adding users and groups

  • To enable sudo for a user there are two approaches: modify /etc/sudoers on the remote host (see here), or use copy-and-replace (see here). You can also simply add the user to the sudo group.

  • To give each user an initial password, see here. On Linux you can use the mkpasswd command; on other systems the Python package passlib is a substitute. For passwordless sudo, add the NOPASSWD keyword.

    ~$ mkpasswd --method=sha-512
    Password:
    $6$5QYVZSmH7$FXQcAQ8FsjMVk0x.ATQgpFHhgImp7hdITMh7zAE.VeAkQYDzdFAOxx6jqVFOY.52nRW4a6SjzEUnK.JSh73W61

    ~$ python -c "from passlib.hash import sha512_crypt; import getpass; print(sha512_crypt.using(rounds=5000).hash(getpass.getpass()))"
    Password:
    $6$J15B6vXoZeekBlVy$.F6PelDYQRCeqapZ2/V3BQ5IjJXCdhG4g5NgoeNvnGJqf1dValk38IDzBuMfmctLMgQ4llyzVT3WN4pYrIpmZ0
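The hash produced above plugs straight into the `user` module's `password` parameter. A minimal fragment (the truncated hash value is illustrative only):

```yaml
- name: Add the user with an initial password
  user:
    name: lcy
    # Must be a pre-hashed string, e.g. from mkpasswd or the passlib one-liner above.
    password: '$6$5QYVZSmH7$FXQcAQ8Fs...'
    state: present
```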
  • Below, two users are added and placed in the relevant groups, and the sshd_config parameters are adjusted.

    ---
    - name: Add users
      hosts: all
      user: root
      tasks:
        - name: Remove users
          user:
            name: '{{ item }}'
            state: absent    # absent removes the user, present adds it
            remove: yes
          with_items:
            - lcy
            - gavin_kou

        - name: Add the user "{{ item }}"
          # See https://docs.ansible.com/ansible/latest/modules/user_module.html?highlight=user
          user:
            name: '{{ item }}'
            append: yes
            groups: docker,sudo
            shell: /bin/bash
            state: present
            generate_ssh_key: yes
            ssh_key_bits: 2048
            # ssh_key_file: .ssh/id_rsa
          # with_items loops over the array: https://docs.ansible.com/ansible/latest/user_guide/playbooks_loops.html?highlight=with_items
          with_items:
            - lcy
            - gavin_kou

        - name: Copy the local public key to the remote user
          # See https://docs.ansible.com/ansible/latest/modules/authorized_key_module.html
          authorized_key:
            user: lcy
            state: present
            # exclusive: True
            key: "{{ lookup('file', lookup('env','HOME') + '/.ssh/id_rsa.pub') }}\n"

        - name: Set authorized key for user gavin_kou copying it from a local file
          # See https://docs.ansible.com/ansible/latest/modules/authorized_key_module.html
          authorized_key:
            user: gavin_kou
            state: present
            key: "{{ lookup('file','~/.ansible/gavin.pub') }}\n"

        # See https://docs.ansible.com/ansible/latest/modules/lineinfile_module.html?highlight=sudoers
        - lineinfile:
            path: /etc/sudoers
            state: present
            regexp: '^%sudo\s'
            line: '%sudo ALL=(ALL) NOPASSWD: ALL'
            validate: '/usr/sbin/visudo -cf %s'

        # Disable root SSH login, so this playbook cannot be run as root a second time.
        - lineinfile:
            path: /etc/ssh/sshd_config
            state: present
            regexp: '^PermitRootLogin\s'
            line: 'PermitRootLogin no'

        - lineinfile:
            path: /etc/ssh/sshd_config
            state: present
            regexp: '^#ClientAliveInterval\s'
            line: 'ClientAliveInterval 30'

        - lineinfile:
            path: /etc/ssh/sshd_config
            state: present
            regexp: '^#ClientAliveCountMax\s'
            line: 'ClientAliveCountMax 3'

        # See https://docs.ansible.com/ansible/latest/modules/systemd_module.html
        - name: Reload sshd
          systemd:
            name: sshd
            state: reloaded

Docker GUIs

  • Docker platform technologies (3)

  • UI For Docker is deprecated; use Portainer instead.

  • Portainer

    ~$  docker run -d -p 9000:9000 --privileged -v /var/run/docker.sock:/var/run/docker.sock --name web-ui uifd/ui-for-docker
  • Portainer is an open-source, lightweight Docker management UI built on the Docker API. It provides a status dashboard, quick deployment from app templates, basic operations on containers, images, networks, and volumes (including pushing/pulling images and creating containers), event logs, a container console, centralized management of Swarm clusters and services, and user login management and access control. It is comprehensive enough to cover most container-management needs of small and mid-sized organizations.

    ~$ docker run -d -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data  -v /etc/hosts:/etc/hosts --name portainer-ui portainer/portainer

Adding other Docker servers to Portainer

  • Protect the Docker daemon socket: for TLS connections, see here.

  • Using the Portainer Agent

  • Portainer supports three ways of adding an external Docker instance. The first is dockerd -H tcp://192.168.xx.xx:2376, listening on a TCP socket, which can be secured with TLS. The second is installing the Portainer agent on the target server. Creating TLS certificates is tedious, and the agent consumes some resources. The third, since Docker 18.09, is client access over SSH: docker -H ssh://me@example.com ps, but Portainer does not support this protocol; it only supports two protocols, tcp:// and unix://.

  • The following shows how to connect to a remote Docker Engine with TLS certificates.

    # Create the CA private key
    ~$ openssl genrsa -aes256 -out ca.key 4096
    # Create the CA certificate from the CA private key.
    ~$ openssl req -new -x509 -days 365 -key ca.key -sha256 -out ca.pem
    # Create the server private key
    ~$ openssl genrsa -out server.key 4096
  • Create the server CSR. Note that if the CN (Common Name) is a domain name such as www.examples.com, clients must reach the server via www.examples.com for verification to succeed. To restrict the IPs or domain names a TLS connection may use, add a key extension file. For example, to allow connections only via 127.0.0.1 and 192.168.1.100: echo subjectAltName = IP:127.0.0.1,IP:192.168.1.100 > allowips.cnf, then append -extfile allowips.cnf to the server-certificate command below.

~$ openssl req -subj "/CN=*" -sha256 -new -key server.key -out server.csr
~$ openssl x509 -req -days 365 -sha256 -in server.csr -CA ca.pem -CAkey ca.key -CAcreateserial -out server.pem
Signature ok
subject=CN = *
Getting CA Private Key
Enter pass phrase for ca.key:
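The subjectAltName flow above can be exercised end to end in a scratch directory. A sketch (unencrypted CA key and short key sizes for brevity; use -aes256 and 4096-bit keys in production):

```shell
# Sketch: sign a server certificate with a subjectAltName extension file,
# restricting the addresses the daemon may be reached at.
dir=$(mktemp -d)
cd "$dir"
openssl genrsa -out ca.key 2048 2>/dev/null
openssl req -new -x509 -days 1 -key ca.key -subj "/CN=test-ca" -sha256 -out ca.pem
openssl genrsa -out server.key 2048 2>/dev/null
openssl req -subj "/CN=*" -sha256 -new -key server.key -out server.csr
echo "subjectAltName = IP:127.0.0.1,IP:192.168.1.100" > allowips.cnf
openssl x509 -req -days 1 -sha256 -in server.csr -CA ca.pem -CAkey ca.key \
    -CAcreateserial -out server.pem -extfile allowips.cnf 2>/dev/null
# The signed certificate now carries both IPs in its SAN field.
openssl x509 -in server.pem -noout -text | grep "IP Address"
```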

- Create the client private key and CSR

~$ openssl genrsa -out client.key 4096
~$ openssl req -subj "/CN=*" -new -key client.key -out client.csr

# For client authentication, add the following extension.
~$ echo extendedKeyUsage = clientAuth > client.cnf
~$ openssl x509 -req -days 365 -sha256 -in client.csr -CA ca.pem -CAkey ca.key -CAcreateserial -out client.pem -extfile client.cnf
  • Run the server and the client for a test. To keep the private keys safe, tighten their permissions: chmod -v 0400 ca.key server.key client.key. To guard the certificates against tampering:
    chmod -v 0444 ca.pem server.pem client.pem.
    ~$ tree
    .
    ├── ca.key
    ├── ca.pem
    ├── ca.srl
    ├── client.key
    ├── client.pem
    ├── server.key
    └── server.pem
    # Run the server
    ~$ sudo dockerd --tlsverify --tlscacert=ca.pem --tlscert=server.pem --tlskey=server.key -H tcp://0.0.0.0:2376
    # Run the client. A hostname must be used here, which is why the portainer-ui container above was started with -v /etc/hosts:/etc/hosts.
    ~$ docker --tlsverify --tlscacert=ca.pem --tlscert=client.pem --tlskey=client.key -H tcp://<hostname>:2376 version
    # Since 18.09 Docker supports access over SSH, which is both safer and more convenient.
    ~$ docker -H ssh://user@domain.com version
    Client:
    Version: 18.09.1
    API version: 1.39
    Go version: go1.10.6
    Git commit: 4c52b90
    Built: Wed Jan 9 19:35:59 2019
    OS/Arch: linux/amd64
    Experimental: false

    Server: Docker Engine - Community
    Engine:
    Version: 18.09.1
    API version: 1.39 (minimum version 1.12)
    Go version: go1.10.6
    Git commit: 4c52b90
    Built: Wed Jan 9 19:02:44 2019
    OS/Arch: linux/amd64
    Experimental: false

Modifying the server-side systemd service

  • The Docker daemon can be started with certificates manually as above, or via the system's systemd unit. Copy the server certificate files to /etc/docker/ and proceed as follows:
    ~$ cat /lib/systemd/system/docker.service
    [...]
    #ExecStart=/usr/bin/dockerd -H fd://  # the original default: local connections only.
    ExecStart=/usr/bin/dockerd --tlsverify --tlscacert=/etc/docker/ca.pem --tlscert=/etc/docker/server.pem --tlskey=/etc/docker/server.key -H tcp://0.0.0.0:2376 -H fd://
    [...]
    ~$ sudo systemctl daemon-reload # reload the unit files.
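Editing the packaged unit under /lib/systemd risks being overwritten on upgrade. An alternative sketch is a systemd drop-in override (paths per standard systemd conventions; the certificate locations are assumed to match the setup above):

```ini
# /etc/systemd/system/docker.service.d/tls.conf
# Created with: sudo mkdir -p /etc/systemd/system/docker.service.d
[Service]
# The empty ExecStart= clears the packaged value before setting a new one.
ExecStart=
ExecStart=/usr/bin/dockerd --tlsverify --tlscacert=/etc/docker/ca.pem --tlscert=/etc/docker/server.pem --tlskey=/etc/docker/server.key -H tcp://0.0.0.0:2376 -H fd://
```

Then apply it with sudo systemctl daemon-reload && sudo systemctl restart docker.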

Enabling certificate mode by default on the client

~$ mkdir -pv ~/.docker
~$ cp ca.pem ~/.docker
~$ cp client.key ~/.docker/key.pem # this exact name is required
~$ cp client.pem ~/.docker/cert.pem # this exact name is required
~$ export DOCKER_HOST=tcp://<HOST>:2376 DOCKER_TLS_VERIFY=1
~$ docker ps   # connect in certificate mode

Docker errors

  • If the following error appears, run systemctl restart docker.service to resolve it.
    docker: Error response from daemon: driver failed programming external connectivity on endpoint condescending_lalande (668389b4f87cc892fc233313eb738d0995c4080d3daabf28bf8c9bbe241a5434):  (iptables failed: iptables --wait -t filter -A DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.2 --dport 9000 -j ACCEPT: iptables: No chain/target/match by that name.
    (exit status 1)).

Using Ansible and Docker together

  • Working with volumes

    ~$ docker volume create redis_vol
    ~$ docker volume inspect redis_vol
    [
    {
    "CreatedAt": "2019-01-09T17:28:44+08:00",
    "Driver": "local",
    "Labels": {},
    "Mountpoint": "/var/lib/docker/volumes/redis_vol/_data",
    "Name": "redis_vol",
    "Options": {},
    "Scope": "local"
    }
    ]
    # Remove all unused volumes
    ~$ docker volume prune
  • docker-py must be installed. A slightly more involved example follows.

    # With Python 3, some modules require installing docker instead: pip3 install docker
    ~$ pip install docker-py
    ~$ tree
    .
    ├── main.yaml
    ├── pgsql
    │ ├── file
    │ │ └── Dockerfile
    │ └── pgsql.yaml
    └── redis
    ├── file
    │ └── Dockerfile
    └── redis.yaml
  • main.yaml

    ---
    - hosts: DB001
      become: yes
      tasks:
        - include: pgsql/pgsql.yaml
        - include: redis/redis.yaml
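On newer Ansible releases the bare include keyword is deprecated; include_tasks is the replacement. A sketch with the same file layout assumed:

```yaml
---
- hosts: DB001
  become: yes
  tasks:
    - include_tasks: pgsql/pgsql.yaml
    - include_tasks: redis/redis.yaml
```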

Redis server

  • https://docs.docker.com/samples/library/redis/

  • https://github.com/docker-library/redis

  • Dockerfile

    # Install the redis service
    FROM debian:stretch
    # CMD echo "hello debian from Dockerfile."
    # https://www.digitalocean.com/community/tutorials/how-to-install-and-secure-redis-on-debian-9
    ENV DEBIAN_FRONTEND noninteractive

    RUN sed -i "s/deb.debian.org/mirrors.cloud.aliyuncs.com/g" /etc/apt/sources.list
    RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install redis-server
    # RUN sed -i 's/^appendonly no$/appendonly yes/g' /etc/redis/redis.conf
    # RUN sed -i 's/^daemonize yes$/daemonize no/g' /etc/redis/redis.conf
    EXPOSE 6379/tcp
    # CMD [ "/etc/init.d/redis-server start" ]
    USER root
    RUN rm -rf /data
    RUN mkdir /data && chown redis:redis -R /data
    VOLUME ["/data" ]
    RUN chown redis:redis -R /data
    CMD ["redis-server","/etc/redis/redis.conf"]
  • redis.yaml


- name: Create the WORKDIR
  file:
    path: workdir
    state: directory
    recurse: yes

- name: Upload the Dockerfile to the target machine
  # https://docs.ansible.com/ansible/latest/modules/synchronize_module.html?highlight=synchronize
  synchronize:
    src: redis/file/Dockerfile
    dest: workdir

- name: Build the Redis image
  # See https://docs.ansible.com/ansible/latest/modules/docker_image_module.html
  docker_image:
    name: redis
    tag: v6
    path: workdir
    state: present

- name: Create the volume directory
  # https://docs.ansible.com/ansible/latest/modules/file_module.html
  become: yes
  file:
    path: ./redis-vol
    state: directory
    # To avoid running docker (and redis-server) as root, make this volume
    # directory owned by the redis uid:gid inside the image, here 101:101.
    owner: 101
    group: 101
    recurse: yes

- name: Create the volume
  # The docker_volume module is problematic here, so create the volume with the
  # command below. A volume cannot be given a path, though options such as
  # --opt type=ext4 --opt device=/dev/sdX are possible; device must be a
  # partition or block device, not a directory.
  command: docker volume create redis_vol3

- name: Redis server
  # See https://docs.ansible.com/ansible/latest/modules/docker_container_module.html?highlight=docker
  docker_container:
    name: redis-server
    # The image built above.
    image: 'redis:v6'
    # command: redis-server --appendonly yes
    state: started
    recreate: yes
    user: redis
    # Require the password 1234 to log in.
    command: redis-server --requirepass 1234 --appendonly yes --dir /data
    published_ports:
      - '6379:6379'
    volumes:
      - ./redis-vol:/data
  • Test the Redis service.
~$ redis-cli
127.0.0.1:6379> auth 1234
OK
127.0.0.1:6379> set dd 100
OK
127.0.0.1:6379> save # persist to disk.
OK
127.0.0.1:6379> exit

~$ redis-cli
127.0.0.1:6379> auth 1234
OK
127.0.0.1:6379> get dd
"100"
127.0.0.1:6379> config get dir # show the working directory.
1) "dir"
2) "/data"
127.0.0.1:6379> quit

PostgreSQL Database

  • For a more elaborate example see docker-postgresql, which supports ENTRYPOINT parameters and primary/replica replication.

    ~$ cat Dockerfile
    # Reference: https://docs.docker.com/engine/examples/postgresql_service/#install-postgresql-on-docker
    FROM debian:stretch
    MAINTAINER lcy

    ENV DEBIAN_FRONTEND noninteractive
    # This line is added to avoid the following error:
    # -----------------------------------------------
    # debconf: unable to initialize frontend: Dialog
    # debconf: (Dialog frontend will not work on a dumb terminal, an emacs shell buffer, or without a controlling terminal.)
    # debconf: falling back to frontend: Readline
    # debconf: unable to initialize frontend: Readline
    # debconf: (Can't locate Term/ReadLine.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.14.2 /usr/local/share/perl/5.14.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.14 /usr/share/perl/5.14 /usr/local/lib/site_perl .) at /usr/share/perl5/Debconf/FrontEnd/Readline.pm line 7, <> line 19.)
    # debconf: falling back to frontend: Teletype
    # dpkg-preconfigure: unable to re-open stdin:
    #------------------------------------------------

    # RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
    # Add the PostgreSQL PGP key to verify their Debian packages.
    # It should be the same key as https://www.postgresql.org/media/keys/ACCC4CF8.asc
    # Deployed on Alibaba Cloud, so switch the apt mirror to Alibaba Cloud's for faster, traffic-free downloads.
    RUN sed -i "s/deb.debian.org/mirrors.cloud.aliyuncs.com/g" /etc/apt/sources.list

    ENV PG_VERSION=10
    ENV PG_USER=postgres
    ENV PG_HOME=/var/lib/postgresql
    ENV PG_RUNDIR=/run/postgresql \
    PG_LOGDIR=/var/log/postgresql \
    PG_DATADIR=${PG_HOME}/${PG_VERSION}/main \
    PG_BINDIR=/usr/lib/postgresql/${PG_VERSION}/bin \
    PG_CONFIG=/etc/postgresql/${PG_VERSION}/main/postgresql.conf

    # These two packages must be installed before apt-key can add the public key.
    RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y apt-utils dirmngr gnupg2

    # CMD ["ping","-c","5","p80.pool.sks-keyservers.net"] may fail here, e.g. nothing is read back or the hostname does not resolve.
    RUN apt-key adv --no-tty --keyserver ipv4.pool.sks-keyservers.net --recv-keys B97B0AFCAA1A47F044F244A07FCC7D46ACCC4CF8

    # Add PostgreSQL's repository. It contains the most recent stable release
    # of PostgreSQL, ``10``.
    RUN echo "deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main" > /etc/apt/sources.list.d/pgdg.list

    # Install ``python-software-properties``, ``software-properties-common`` and PostgreSQL 10.6
    # There are some warnings (in red) that show up during the build. You can hide
    # them by prefixing each apt-get statement with DEBIAN_FRONTEND=noninteractive

    RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y software-properties-common postgresql-10 postgresql-client-10 postgresql-contrib-10
    # Mind the rwx permissions on these file paths.
    USER root
    RUN mkdir -p ${PG_DATADIR} && chown -R postgres:postgres ${PG_DATADIR}
    RUN mkdir -p ${PG_LOGDIR} && chown -R postgres:postgres ${PG_LOGDIR}

    # Adjust PostgreSQL configuration so that remote connections to the
    # database are possible.
    USER postgres
    RUN echo "host all all 0.0.0.0/0 md5" >> /etc/postgresql/10/main/pg_hba.conf

    # And add ``listen_addresses`` to ``/etc/postgresql/10.6/main/postgresql.conf``
    RUN echo "listen_addresses='*'" >> /etc/postgresql/10/main/postgresql.conf

    # Create a PostgreSQL role named ``docker`` with ``docker`` as the password and
    # then create a database `docker` owned by the ``docker`` role.
    # Note: here we use ``&&\`` to run commands one after the other - the ``\``
    # allows the RUN command to span multiple lines.
    RUN /etc/init.d/postgresql start && psql --command "CREATE USER docker WITH SUPERUSER PASSWORD 'docker';" && createdb -O docker docker

    # Expose the PostgreSQL port
    EXPOSE 5432/tcp
    # Add VOLUMEs to allow backup of config, logs and databases
    VOLUME ["/etc/postgresql", "/var/log/postgresql", "/var/lib/postgresql/10/main"]
    # RUN ${PG_BINDIR}/initdb -D ${PG_DATADIR}

    # Set the default command to run when starting the container
    # For version 9.6 this would be /usr/lib/postgresql/9.6/bin/postgres; for 10.x the form below works.
    CMD ["/usr/lib/postgresql/10/bin/postgres", "-D", "/var/lib/postgresql/10/main", "-c", "config_file=/etc/postgresql/10/main/postgresql.conf"]
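The `pg_hba.conf` line appended above ("host all all 0.0.0.0/0 md5") has five whitespace-separated fields: connection type, database, user, client address, and authentication method. A small hypothetical parser (the names are mine, not PostgreSQL's) just to make the layout explicit:

```python
def parse_hba_line(line: str) -> dict:
    """Split one pg_hba.conf host record into its five fields."""
    conn_type, database, user, address, method = line.split()
    return {"type": conn_type, "database": database, "user": user,
            "address": address, "method": method}

rule = parse_hba_line("host all all 0.0.0.0/0 md5")
print(rule["address"], rule["method"])  # any client, md5 password auth
```

`0.0.0.0/0` matches every IPv4 client, which together with `listen_addresses='*'` is what makes the container reachable from outside; tighten the CIDR in production.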
  • pgsql.yaml


- name: Create the WORKDIR
  file:
    path: workdir
    state: directory
    recurse: yes

- name: Upload the Dockerfile to the target host
  # https://docs.ansible.com/ansible/latest/modules/synchronize_module.html?highlight=synchronize
  synchronize:
    src: pgsql/file/Dockerfile
    dest: workdir

- name: Build the PostgreSQL server image
  # https://docs.ansible.com/ansible/latest/modules/docker_image_module.html?highlight=docker_image
  # https://www.postgresql.org/download/linux/debian/
  docker_image:
    name: postgresql-10
    tag: v6
    path: workdir
    # dockerfile: file/Dockerfile
    state: present

- name: Create the VOLUME directory
  # https://docs.ansible.com/ansible/latest/modules/file_module.html
  become: yes
  file:
    path: ./pgsql-vol/10/main
    state: directory
    # To avoid running docker as root, and to avoid running postgresql as root inside
    # the container, set the owner of this volume directory to the uid:gid of the
    # postgres user in the image; here that is uid=111(postgres) gid=121(postgres)
    # groups=121(postgres),120(ssl-cert).
    owner: postgres
    group: postgres
    recurse: yes

# # https://docs.ansible.com/ansible/latest/modules/docker_volume_module.html?highlight=docker_volume
# - name: Create a data volume
#   docker_volume:
#     name: pgdb_001
#     driver_options:
#       device: ./docker/volume/pgsql-vol

- name: Create the postgresql container
  # https://docs.ansible.com/ansible/latest/modules/docker_container_module.html?highlight=docker
  # Equivalent to the command module running:
  # ``docker run -p 5432:5432 -d --name pgsql7 --user postgres -v ~/pgsql-vol:/var/lib/postgresql postgresql-10:v6``
  docker_container:
    name: pgsql
    image: 'postgresql-10:v6'
    # Docker networking: https://docs.ansible.com/ansible/latest/modules/docker_network_module.html
    # network_mode: host
    dns_servers:
      - '8.8.8.8'
      - '100.100.2.136'
    published_ports:
      - '5432:5432'
    state: started
    # restart_policy: always
    # detach: no
    user: postgres
    # https://www.katacoda.com/courses/docker/persisting-data-using-volumes
    volumes:
      - ./pgsql-vol:/var/lib/postgresql

Installing Jenkins


Development Board Overview

  • The STM32 Nucleo-144 F767ZI boards offer combinations of performance and power that provide an affordable and flexible
    way for users to build prototypes and try out new concepts. For compatible boards, the SMPS significantly reduces power
    consumption in Run mode.
  • The Arduino-compatible ST Zio connector expands functionality of the Nucleo open development platform, with a wide choice
    of specialized Arduino* Uno V3 shields.
  • The STM32 Nucleo-144 board does not require any separate probe as it integrates the ST-LINK/V2-1 debugger/programmer.
  • The STM32 Nucleo-144 board comes with the STM32 comprehensive free software libraries and examples available with the
    STM32Cube MCU Package.
  • Key Features
    • STM32 microcontroller in LQFP144 package
    • Ethernet compliant with IEEE-802.3-2002 (depending on STM32 support)
    • USB OTG or full-speed device (depending on STM32 support)
    • 3 user LEDs
    • 2 user and reset push-buttons
    • 32.768 kHz crystal oscillator
    • Board connectors:
      • USB with Micro-AB
      • SWD
      • Ethernet RJ45 (depending on STM32 support)
      • ST Zio connector including Arduino* Uno V3
      • ST morpho
    • Flexible power-supply options: ST-LINK USB VBUS or external sources.
    • On-board ST-LINK/V2-1 debugger/programmer with USB re-enumeration
      capability: mass storage, virtual COM port and debug port.
    • Comprehensive free software libraries and examples available with the STM32Cube MCU package.
  • Per the documentation, "The Cortex®-M7 with FPU core is binary compatible with the Cortex®-M4 core": the
    Cortex-M7 core outperforms the M4 and is binary compatible with it.

Zephyr

System Overview

  • The Zephyr kernel has an extremely small memory footprint and is designed primarily for resource-constrained systems: from simple embedded environmental sensors and LED wearables up to sophisticated smart watches and IoT wireless gateways. Zephyr was designed from the start for multiple architectures, including ARM Cortex-M, Intel x86, ARC, NIOS II and RISC-V. A major advantage of Zephyr is that its operating system (OS) and software development kit (SDK) support hundreds of development boards.

Installing the Zephyr Environment

Initializing the Project

~$ pip3 install west
~$ west init ~/zephyrproject
~$ cd zephyrproject
~$ west update # west syncs a large number of git repositories here.
~$ west zephyr-export
  • After updating, the directory structure looks like this:
~ zephyrproject$ tree -L 2
.
├── bootloader
│ └── mcuboot
├── modules
│ ├── bsim_hw_models
│ ├── crypto
│ ├── debug
│ ├── fs
│ ├── hal
│ ├── lib
│ └── tee
├── tools
│ ├── ci-tools
│ ├── edtt
│ └── net-tools
└── zephyr
├── arch
├── boards
├── cmake
├── CMakeLists.txt
├── CODE_OF_CONDUCT.md
├── CODEOWNERS
├── CONTRIBUTING.rst
├── doc
├── drivers
├── dts
├── include
├── Kconfig
├── Kconfig.zephyr
├── kernel
├── lib
├── LICENSE
├── MAINTAINERS.yml
├── Makefile
├── misc
├── modules
├── README.rst
├── samples
├── scripts
├── share
├── soc
├── subsys
├── tests
├── VERSION
├── version.h.in
├── west.yml
├── zephyr-env.cmd
└── zephyr-env.sh

32 directories, 15 files
  • Install the (many) required Python packages:
    ~ zephyrproject$  pip install -r ./zephyr/scripts/requirements.txt

Installing the Toolchain

~$ wget https://github.com/zephyrproject-rtos/sdk-ng/releases/download/v0.11.4/zephyr-sdk-0.11.4-setup.run
~$ chmod +x zephyr-sdk-0.11.4-setup.run
~$ ./zephyr-sdk-0.11.4-setup.run -- -d ~/zephyr-sdk-0.11.4
  • After the installer finishes, an environment file ~/.zephyrrc is created. Next, install the udev rules:
    ~$ sudo cp ~/zephyr-sdk-0.11.4/sysroots/x86_64-pokysdk-linux/usr/share/openocd/contrib/60-openocd.rules  /etc/udev/rules.d/
    ~$ sudo udevadm control --reload

Board Sample Application

~$  west build -p auto  -b nucleo_f767zi samples/hello_world
[......]
-- west build: building application
[1/131] Preparing syscall dependency handling

[126/131] Linking C executable zephyr/zephyr_prebuilt.elf
Memory region Used Size Region Size %age Used
FLASH: 13448 B 2 MB 0.64%
DTCM: 0 GB 128 KB 0.00%
SRAM: 4432 B 384 KB 1.13%
IDT_LIST: 200 B 2 KB 9.77%
[131/131] Linking C executable zephyr/zephyr.elf
# Inspect the build output directory.
~$ ls build/zephyr/
arch drivers kconfig linker.cmd.dep nucleo_f767zi.dts.pre.d zephyr.bin zephyr.map
boards edt.pickle kernel linker_pass_final.cmd nucleo_f767zi.dts.pre.tmp zephyr.dts zephyr_prebuilt.elf
cmake include lib linker_pass_final.cmd.dep runners.yaml zephyr.elf zephyr_prebuilt.map
CMakeFiles isrList.bin libzephyr.a misc soc zephyr.hex zephyr.stat
cmake_install.cmake isr_tables.c linker.cmd nucleo_f767zi.dts_compiled subsys zephyr.lst
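The `%age Used` column in the linker report above is simply used size over region size; a quick sketch with the values copied from the report:

```python
def pct_used(used: int, region: int) -> float:
    """Percent of a memory region consumed, rounded as in the report."""
    return round(100.0 * used / region, 2)

KIB, MIB = 1024, 1024 * 1024
report = {
    "FLASH":    (13448, 2 * MIB),
    "DTCM":     (0,     128 * KIB),
    "SRAM":     (4432,  384 * KIB),
    "IDT_LIST": (200,   2 * KIB),
}
for region, (used, size) in report.items():
    print(f"{region}: {pct_used(used, size):.2f}%")  # FLASH: 0.64% ... IDT_LIST: 9.77%
```

The hello_world sample uses well under one percent of the F767ZI's 2 MB flash and 384 KB SRAM, which is why it fits comfortably on this part.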
  • Connect the target board over USB and flash the sample with:
~$ west flash
Open On-Chip Debugger 0.10.0+dev-01341-g580d06d9d-dirty (2020-06-25-12:07)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
Info : clock speed 2000 kHz
Info : STLINK V2J23M7 (API v2) VID:PID 0483:374B
Info : Target voltage: 3.245850
Info : stm32f7x.cpu: hardware has 8 breakpoints, 4 watchpoints
Info : Listening on port 3333 for gdb connections
TargetName Type Endian TapName State
-- ------------------ ---------- ------ ------------------ ------------
0* stm32f7x.cpu hla_target little stm32f7x.cpu running

Info : Unable to match requested speed 2000 kHz, using 1800 kHz
Info : Unable to match requested speed 2000 kHz, using 1800 kHz
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08000268 msp: 0x20020e8c
Info : device id = 0x10006451
Info : flash size = 2048 kbytes
Info : Single Bank 2048 kiB STM32F76x/77x found
auto erase enabled
wrote 32768 bytes from file /fullpath/zephyrproject/zephyr/build/zephyr/zephyr.hex in 1.283166s (24.938 KiB/s)
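The KiB/s figure openocd prints is just bytes written divided by elapsed time; for the numbers above:

```python
def kib_per_s(nbytes: int, seconds: float) -> float:
    """Transfer rate in KiB/s, rounded as openocd reports it."""
    return round(nbytes / 1024 / seconds, 3)

print(kib_per_s(32768, 1.283166))  # 24.938, matching openocd's output
```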
  • On the USB serial port /dev/ttyACM0 you should see:
~$ sudo minicom -o -b 115200 -D /dev/ttyACM0
*** Booting Zephyr OS build v2.4.0-rc3-10-g0a0cb52fb229 ***
Hello World! nucleo_f767zi

Board-Level Debugging

~$ west debug
-- west debug: rebuilding
[0/1] cd /fullpath/zephyrproject/zephyr/build/zephyr/cmake/flash && /usr/bin/cmake -E echo

-- west debug: using runner openocd
/fullpath/zephyr-sdk-0.11.4/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb: error while loading shared libraries: libpython3.8.so.1.0: cannot open shared object file: No such file or directory
FATAL ERROR: command exited with status 127: /fullpath/zephyr-sdk-0.11.4/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb -ex 'target remote :3333' /fullpath/zephyrproject/zephyr/build/zephyr/zephyr.elf
  • The error above occurs because Python 3.8.2 was installed with pyenv, so its libraries are not on the system search path, and by default pyenv builds Python statically. Reinstall it with shared libraries enabled and point LD_LIBRARY_PATH at them:
~$ CONFIGURE_OPTS=--enable-shared pyenv install 3.8.2
pyenv: /home/michael/.pyenv/versions/3.8.2 already exists
continue with installation? (y/N) y
Installing Python-3.8.2...
Installed Python-3.8.2 to /home/michael/.pyenv/versions/3.8.2

~$ tree -L 1 /home/michael/.pyenv/versions/3.8.2/lib
/home/michael/.pyenv/versions/3.8.2/lib
├── libpython3.8.a
├── libpython3.8.so -> libpython3.8.so.1.0
├── libpython3.8.so.1.0
├── libpython3.so
├── pkgconfig
└── python3.8

2 directories, 4 files

  • Run the debugger again:
~$ LD_LIBRARY_PATH=/home/michael/.pyenv/versions/3.8.2/lib west debug
-- west debug: rebuilding
[....]
This GDB was configured as "--host=x86_64-build_pc-linux-gnu --target=arm-zephyr-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
[...]
Reading symbols from /fullpath/zephyrproject/zephyr/build/zephyr/zephyr.elf...
Info : stm32f7x.cpu: hardware has 0 breakpoints, 10 watchpoints
Info : Listening on port 3333 for gdb connections
TargetName Type Endian TapName State
-- ------------------ ---------- ------ ------------------ ------------
0* stm32f7x.cpu hla_target little stm32f7x.cpu halted

Info : Listening on port 6333 for tcl connections
Info : Listening on port 4444 for telnet connections
Remote debugging using :3333
Info : accepting 'gdb' connection on tcp/3333
Debugger attaching: halting execution
Info : Unable to match requested speed 2000 kHz, using 1800 kHz
Info : Unable to match requested speed 2000 kHz, using 1800 kHz
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08001004 msp: 0x20020810
force hard breakpoints
Info : device id = 0x10006451
Info : flash size = 2048 kbytes
Info : Single Bank 2048 kiB STM32F76x/77x found
Info : flash size = 1024 bytes
0x08001004 in z_vprintk (out=0x0, ctx=0x0, fmt=0x0, ap=...) at /fullpath/zephyrproject/zephyr/lib/os/printk.c:292
292 while (*s) {
(gdb)

Testing a Non-Standard Board (STM32F030 DEMO BOARD)

  • STM32F030 DEMO BOARD
  • My stm32f030_demo board has no on-board ST-LINK/V2 debugger, but it does expose a GND, VCC, DIO, CLK header. I happen to have a J-Link OB with the same pinout, so it is used below for flashing and debugging.
~$ west build -b stm32f030_demo samples/basic/blinky
  • Configure the openocd connection parameters:

    ~$ cat > jlink-ob-stm32f0.cfg<<EOF
    source [find interface/jlink.cfg]
    transport select swd
    source [find target/stm32f0x.cfg]
    EOF
  • Flash with openocd:

    ~$ cd build/zephyr
    ~$ openocd -f ~/jlink-ob-stm32f0.cfg -c init -c "reset halt" -c "stm32f0x mass_erase 0" -c "flash write_bank 0 zephyr.bin 0" -c "reset run"

NuttX

System Overview

  • NuttX is a real-time operating system (RTOS) focused on standards compliance and a small memory footprint. It can be deployed on microcontrollers from 8-bit to 32-bit. NuttX primarily follows the POSIX and ANSI standards; for features those standards lack, such as fork(), it borrows from VxWorks and other RTOSes. NuttX is written almost entirely in C and uses Kconfig to generate GNU makefiles. The distribution includes the NuttX kernel itself plus a good deal of middleware and board support packages. The kernel and the vast majority of the code were written, and are maintained, by Gregory Nutt, and all community contributions must be approved by him. NuttX was first released by Gregory Nutt under the BSD license in 2007.

Building a Toolchain with Buildroot (Optional)

  • To build a custom toolchain with Buildroot, use the following configuration:

    ~$ git clone https://bitbucket.org/nuttx/buildroot.git buildroot
    ~$ cp configs/cortexm7f-eabi-defconfig-7.4.0 .config
    ~$ make menuconfig
    # final configuration:
    ~$ grep -v '^$\|^#' .config
    BR2_HAVE_DOT_CONFIG=y
    BR2_arm=y
    BR2_cortex_m7f=y
    BR2_GCC_CORTEX=y
    BR2_GCC_CORTEX_M7F=y
    BR2_ARM_EABI=y
    BR2_ARCH="arm"
    BR2_GCC_TARGET_TUNE="cortex-m7"
    BR2_GCC_TARGET_ARCH="armv7-m"
    BR2_GCC_TARGET_ABI="aapcs-linux"
    BR2_WGET="wget --passive-ftp"
    BR2_SVN="svn co"
    BR2_ZCAT="zcat"
    BR2_BZCAT="bzcat"
    BR2_TAR_OPTIONS=""
    BR2_DL_DIR="$(BASE_DIR)/../archives"
    BR2_STAGING_DIR="$(BUILD_DIR)/staging_dir"
    BR2_NUTTX_DIR="$(TOPDIR)/../nuttx"
    BR2_TOPDIR_PREFIX=""
    BR2_TOPDIR_SUFFIX=""
    BR2_GNU_BUILD_SUFFIX="pc-elf"
    BR2_GNU_TARGET_SUFFIX="nuttx-eabi"
    BR2_PACKAGE_BINUTILS=y
    BR2_BINUTILS_VERSION_2_28_1=y
    BR2_BINUTILS_SUPPORTS_NUTTX_OS=y
    BR2_BINUTILS_VERSION="2.28.1"
    BR2_EXTRA_BINUTILS_CONFIG_OPTIONS=""
    BR2_PACKAGE_GCC=y
    BR2_GCC_VERSION_7_4_0=y
    BR2_GCC_SUPPORTS_SYSROOT=y
    BR2_GCC_SUPPORTS_NUTTX_OS=y
    BR2_GCC_SUPPORTS_DOWN_PREREQ=y
    BR2_GCC_DOWNLOAD_PREREQUISITES=y
    BR2_GCC_VERSION="7.4.0"
    BR2_EXTRA_GCC_CONFIG_OPTIONS=""
    BR2_INSTALL_LIBSTDCPP=y
    BR2_PACKAGE_GDB_HOST=y
    BR2_GDB_VERSION_8_0_1=y
    BR2_PACKAGE_GDB_TUI=y
    BR2_GDB_VERSION="8.0.1"
    BR2_PACKAGE_NXFLAT=y
    BR2_PACKAGE_GENROMFS=y
    BR2_PACKAGE_KCONFIG_FRONTENDS=y
    BR2_KCONFIG_VERSION_4_11_0_1=y
    BR2_KCONFIG_FRONTENDS_VERSION="4.11.0.1"
    BR2_LARGEFILE=y
    BR2_SOFT_FLOAT=y
    BR2_TARGET_OPTIMIZATION="-Os -pipe"

  • A successful build creates a directory under buildroot with this structure:

    ~$ tree -L 2 build_arm_hf/
    build_arm_hf/
    ├── root
    └── staging_dir
    ├── arm-elf -> arm-nuttx-eabi
    ├── arm-nuttx-eabi
    ├── bin
    ├── include
    ├── lib
    ├── libexec
    ├── share
    └── usr

    10 directories, 0 files

Compiling

  • nuttx Wiki
  • Add the absolute path of the toolchain above to your shell environment. Third-party toolchains also work, e.g. Zephyr's arm-zephyr-eabi or gcc-arm-none-eabi-6-2017-q2-update; on Debian you can install the distribution's gcc-arm-none-eabi package. In make menuconfig, select CONFIG_ARMV7M_TOOLCHAIN_GNU_EABIL=y.
~$ tools/configure.sh -L | grep "f767"
nucleo-144:f767-netnsh
nucleo-144:f767-nsh
nucleo-144:f767-evalos
  • Since the NUCLEO-F767ZI has an Ethernet port, the nucleo-144:f767-netnsh configuration is used here:

    ~$ tools/configure.sh  nucleo-144:f767-netnsh
    ~$ make oldconfig
    ~$ make menuconfig
    ~$ make # with a third-party toolchain, e.g.: make CROSSDEV=arm-none-eabi-
  • The system configuration finally tested here is covered in the sections below.

ESP8266 Communication

  • USART6 on the board is used for the ESP8266, while USART3 serves as the system console by default:

    CONFIG_STM32F7_USART6=y
    CONFIG_USART6_SERIALDRIVER=y
    CONFIG_DEV_CONSOLE=y
    CONFIG_NSH_CONSOLE=y
    CONFIG_SERIAL_CONSOLE=y
    CONFIG_USART3_SERIAL_CONSOLE=y
    CONFIG_NUCLEO_CONSOLE_VIRTUAL=y
    CONFIG_NETUTILS_ESP8266_DEV_PATH="/dev/ttyS1"

    # Install the CU command, a minimal serial terminal similar to putty or minicom.
    CONFIG_SYSTEM_CUTERM=y
    CONFIG_SYSTEM_CUTERM_DEFAULT_DEVICE="/dev/ttyS0"
    CONFIG_SYSTEM_CUTERM_DEFAULT_BAUD=115200
    CONFIG_SYSTEM_CUTERM_STACKSIZE=2048
    CONFIG_SYSTEM_CUTERM_PRIORITY=100
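All the serial links in this section run at 115200 baud with 8N1 framing (one start bit, eight data bits, one stop bit), so each byte costs ten bit times; a quick sanity check of the resulting throughput ceiling:

```python
baud = 115200
bits_per_frame = 1 + 8 + 1        # start + 8 data + stop (8N1, no parity)
print(baud // bits_per_frame)     # 11520 bytes per second at most
```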

  • Also add the USART6 pin definitions in boards/arm/stm32f7/nucleo-144/include/board.h:

[...]
# define GPIO_USART6_RX GPIO_USART6_RX_2
# define GPIO_USART6_TX GPIO_USART6_TX_2
[...]
  • Wire up the ESP8266:

    STM32F767ZI-NUCLEO          ESP01

    D0 RX ---> TX
    D1 TX ---> RX
    GND ---> GND
    3V3 ---> 3V3
    3V3 ---> CH_PD # pull high for normal (boot) mode
  • Connection test:

    nsh> cu -s 115200 -l /dev/ttyS1
    AT

    OK
    AT+GMR
    AT version:1.7.4.0(May 11 2020 19:13:04)
    SDK version:3.0.4(9532ceb)
    compile time:May 27 2020 10:12:17
    Bin version(Wroom 02):1.7.4
    OK

    AT+SYSRAM?
    +SYSRAM:51952

    OK
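The AT exchange above is plain line-oriented text terminated by `OK` or `ERROR`. A small hypothetical helper (not part of NuttX; the names are mine) that splits a captured response into payload lines and a final status, applied to the `AT+SYSRAM?` reply above:

```python
def parse_at_response(raw: str):
    """Split an AT response into payload lines and a final status line."""
    lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
    status = lines[-1] if lines and lines[-1] in ("OK", "ERROR") else None
    payload = lines[:-1] if status else lines
    return payload, status

# The modem echoes the command, then sends the data line, a blank line, and "OK".
payload, status = parse_at_response("AT+SYSRAM?\r\n+SYSRAM:51952\r\n\r\nOK\r\n")
free_ram = int(payload[-1].split(":")[1])
print(status, free_ram)  # OK 51952
```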

External SPI SD Card Reader

  • SPI3 ((PB3:CLK), (PB4:MISO), (PB5:MOSI)) is used to attach the external SD card reader. According to the
    stm32f767zi.pdf datasheet, NSS (CS) can be driven either in software or in hardware (PA4:GPIO_SPI3_NSS_2);
    since this use case is very simple, PA4 is used here as the slave's CS.
  • NuttX has no mmcsd-related file under boards/arm/stm32f7/nucleo-144/src; copy a
    stm32_mmcsd.c from another board and adapt it.
  • Define CS in boards/arm/stm32f7/nucleo-144/src/nucleo-144.h:
[...]
#define GPIO_SPI_CS (GPIO_OUTPUT | GPIO_PUSHPULL | GPIO_SPEED_50MHz | \
GPIO_OUTPUT_SET)
#if defined(CONFIG_MMCSD_SPI)
#define GPIO_SPI3_CARD_CS (GPIO_SPI_CS | GPIO_PORTA | GPIO_PIN4)
#endif
[...]
  • Implement the functions in boards/arm/stm32f7/nucleo-144/src/stm32_mmcsd.c:
[....]
#ifdef CONFIG_STM32F7_SPI3
int stm32_spi3register(struct spi_dev_s *dev, spi_mediachange_t callback,
void *arg)
{
/* TODO: media change callback */
return OK;
}
#endif

/*****************************************************************************
* Name: stm32_mmcsd_initialize
*
* Description:
* Initialize SPI-based SD card and card detect thread.
****************************************************************************/

int stm32_mmcsd_initialize(int port, int minor)
{
struct spi_dev_s *spi;
int rv;

stm32_configgpio(GPIO_SPI3_CARD_CS); /* Assign CS */
stm32_gpiowrite(GPIO_SPI3_CARD_CS, 1); /* Ensure the CS is inactive */

mcinfo("INFO: Initializing mmcsd port %d minor %d \n",
port, minor);

spi = stm32_spibus_initialize(port);
if (spi == NULL)
{
mcerr("ERROR: Failed to initialize SPI port %d\n", port);
return -ENODEV;
}

rv = mmcsd_spislotinitialize(minor, minor, spi);
if (rv < 0)
{
mcerr("ERROR: Failed to bind SPI port %d to SD slot %d\n",
port, minor);
return rv;
}

spiinfo("INFO: mmcsd card has been initialized successfully\n");
return OK;
}
  • Modify the related functions in boards/arm/stm32f7/nucleo-144/src/stm32_spi.c:

    [...]
    #ifdef CONFIG_STM32F7_SPI3
    void stm32_spi3select(FAR struct spi_dev_s *dev, uint32_t devid, bool selected)
    {
    spiinfo("devid: %d CS: %s\n", (int)devid, selected ? "assert" : "de-assert");
    #if defined(CONFIG_MMCSD_SPI)
    stm32_gpiowrite(GPIO_SPI3_CARD_CS, !selected);
    #endif
    }

    uint8_t stm32_spi3status(FAR struct spi_dev_s *dev, uint32_t devid)
    {
    uint8_t ret = 0;
    #if defined(CONFIG_MMCSD_SPI)
    if (devid == SPIDEV_MMCSD(0))
    {
    /* Note: SD_DET is pulled high when there's no SD card present. */
    /* The reader has no CD (card-detect) pin, or the feature is unused, so we
     * must report the card as present; the subsequent read/write path depends
     * on this condition, which is important. */

    ret |= SPI_STATUS_PRESENT;
    }
    #endif
    return ret;
    }
  • Hook MMCSD_SPI initialization into the board bring-up by modifying boards/arm/stm32f7/nucleo-144/src/stm32_appinitialize.c:

    [...]
    #ifdef CONFIG_MMCSD_SPI
    /* Initialize the MMC/SD SPI driver (SPI3 is used) */

    ret = stm32_mmcsd_initialize(3, CONFIG_NSH_MMCSDMINOR);
    if (ret < 0)
    {
    syslog(LOG_ERR, "Failed to initialize SD slot %d: %d\n",
    CONFIG_NSH_MMCSDMINOR, ret);
    }
    #endif
    [....]
  • Finally, add the following to boards/arm/stm32f7/nucleo-144/src/Makefile:

    [...]
    ifeq ($(CONFIG_MMCSD_SPI),y)
    CSRCS += stm32_mmcsd.c
    endif
    [...]

QSPI Driver (Not Yet Working)

  • For some reason the board files do not define the QSPI pins. The definitions below only compile; reads and writes have not succeeded yet.
    ~$ grep "/* QSPI"  boards/arm/stm32f7/nucleo-144/include/board.h -A 20
    /* QSPI
    *
    * reference from UM1974 chapter 6.14
    * stm32f7/hardware/stm32f76xx77xx_pinmap.h
    *
    * PB6 GPIO_QSPI_CS CN10-13
    * PB2 GPIO_QSPI_SCK CN10-15
    * PD11 GPIO_QSPI_IO0 CN10-23
    * PD12 GPIO_QSPI_IO1 CN10-21
    * PE2 GPIO_QSPI_IO2 CN10-25
    * PD13 GPIO_QSPI_IO3 CN10-19
    *
    * */
    #define GPIO_QSPI_CS GPIO_QUADSPI_BK1_NCS_1
    #define GPIO_QSPI_SCK GPIO_QUADSPI_CLK_1
    #define GPIO_QSPI_IO0 GPIO_QUADSPI_BK1_IO0_3
    #define GPIO_QSPI_IO1 GPIO_QUADSPI_BK1_IO1_3
    #define GPIO_QSPI_IO2 GPIO_QUADSPI_BK1_IO2_1
    #define GPIO_QSPI_IO3 GPIO_QUADSPI_BK1_IO3_2


Flashing and Testing

  • A successful build leaves three files in nuttx: nuttx, nuttx.bin and nuttx.hex. Flash and debug the target board with openocd:

    ~$ sudo openocd -f board/stm32f7discovery.cfg -c init -c "reset halt" -c "flash write_image erase nuttx.bin 0x08000000"
  • Connect Ethernet and USB, open the serial port with minicom, and enter the system. Sometimes boot stalls for a long time; try pressing the B1 USER button on the board.

    ~$ minicom -o -b 115200 -D /dev/ttyACM0

    nsh> help
    help usage: help [-v] [<cmd>]

    . cat df hexdump mkdir mw set umount
    [ cd echo ifconfig mkfatfs nslookup sleep unset
    ? cp exec ifdown mkfifo ps source usleep
addroute cmp exit ifup mkrd pwd test wget
    arp dirname false kill mh rm time xd
    basename dd free ls mount rmdir true
    break delroute help mb mv route uname

    Builtin Apps:
    ping6 renew ntpcstop sh
    ntpcstart ping mm nsh

IoT.js

~$ git clone https://github.com/Samsung/iotjs.git
~$ ls
apps buildroot iotjs nuttx
# Build it first.
~$ cd iotjs && tools/build.py --target-arch=arm --target-os=nuttx --nuttx-home=/fullpath/nuttx --target-board=stm32f7nucleo --jerry-heaplimit=78
==> Initialize submodule

git submodule init

Submodule 'deps/http-parser' (https://github.com/Samsung/http-parser.git) registered for path 'deps/http-parser'
Submodule 'deps/jerry' (https://github.com/jerryscript-project/jerryscript.git) registered for path 'deps/jerry'
Submodule 'deps/libtuv' (https://github.com/Samsung/libtuv.git) registered for path 'deps/libtuv'
Submodule 'deps/mbedtls' (https://github.com/ARMmbed/mbedtls.git) registered for path 'deps/mbedtls'
git submodule update
  • The build fails with the error below because of conditional macros in the headers: CONFIG_SERIAL_TERMIOS=y was not selected in the NuttX configuration, which leads to error: field 'orig_termios' has incomplete type.

    In file included from /fullpath/iotjs/deps/libtuv/include/uv.h:77:0,
    from /fullpath/iotjs/deps/libtuv/src/fs-poll.c:22:
    /fullpath/iotjs/deps/libtuv/include/uv-unix.h:428:18: error: field 'orig_termios' has incomplete type
    struct termios orig_termios; \
    ^
    /fullpath/iotjs/deps/libtuv/include/uv.h:680:3: note: in expansion of macro 'UV_TTY_PRIVATE_FIELDS'
    UV_TTY_PRIVATE_FIELDS
    ^
    make[5]: *** [CMakeFiles/tuv.dir/build.make:63: CMakeFiles/tuv.dir/src/fs-poll.c.obj] Error 1
    make[4]: *** [CMakeFiles/Makefile2:73: CMakeFiles/tuv.dir/all] Error 2
    make[3]: *** [Makefile:130: all] Error 2
    make[2]: *** [CMakeFiles/libtuv.dir/build.make:111: deps/libtuv/src/libtuv-stamp/libtuv-build] Error 2
    make[1]: *** [CMakeFiles/Makefile2:184: CMakeFiles/libtuv.dir/all] Error 2
    make: *** [Makefile:130: all] Error 2
  • Create an iotjs directory under apps/system; the app content comes from the STM32F4 tree:

    ~$ mkdir apps/system/iotjs
    ~$ cp iotjs/config/nuttx/stm32f4dis/app/* apps/system/iotjs
  • A line source "/<fullpath>/apps/system/iotjs/Kconfig" must be added to apps/system/Kconfig.

Talking to the ESP8266 over a Serial Port

  • A few configuration options are mandatory for the ESP8266; UART4 on the board is used here:
    -> System Type -> STM32 Peripheral Support -> [*] UART4
    -> Application Configuration -> Network Utilities -> [*] ESP8266
    -> Application Configuration -> System Libraries and NSH Add-Ons -> [*] CU minimal serial terminal
  • Error from missing UART4 pin definitions:
    CC:  chip/stm32_serial.c
    chip/stm32_serial.c:1037:20: error: 'GPIO_UART4_TX' undeclared here (not in a function)
    .tx_gpio = GPIO_UART4_TX,
    ^
    chip/stm32_serial.c:1038:20: error: 'GPIO_UART4_RX' undeclared here (not in a function)
    .rx_gpio = GPIO_UART4_RX,
    ^
    make[1]: *** [Makefile:154: stm32_serial.o] Error 1

STM32F103 Minimal System

  • The minimal board used here brings out standard JTAG and most of the pins, which keeps things clear; the main reference is the board's README.md inside NuttX. Enabling the CDC/ACM serial port over USB did not succeed here, though.
~$ tools/configure.sh  stm32f103-minimum:usbnsh
  • According to the in-tree documentation, the STM32F103C8T6 actually has 128KB of internal flash rather than the 64KB its datasheet claims, so modify the ld script as follows:

    ~$ cat boards/arm/stm32/stm32f103-minimum/scripts/ld.script
    [....]
    /* The STM32F103C8T6 has 64Kb of FLASH beginning at address 0x0800:0000 and
    * 20Kb of SRAM beginning at address 0x2000:0000. When booting from FLASH,
    * FLASH memory is aliased to address 0x0000:0000 where the code expects to
    * begin execution by jumping to the entry point in the 0x0800:0000 address
    * range.
    *
    * NOTE: While the STM32F103C8T6 states that the part has 64Kb of FLASH,
    * all parts that I have seen do, in fact, have 128Kb of FLASH. That
    * additional 64Kb of FLASH can be utilized by simply change the LENGTH
    * of the flash region from 64K to 128K.
    */

    MEMORY
    {
    flash (rx) : ORIGIN = 0x08000000, LENGTH = 128K
    sram (rwx) : ORIGIN = 0x20000000, LENGTH = 20K
    }
    [...]
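With LENGTH raised to 128K the flash region spans 0x08000000 up to 0x08020000; a quick check of the map, which also shows why the change matters for the images flashed later in this section (e.g. a 78848-byte nuttx.bin would not fit in 64 KiB):

```python
FLASH_ORIGIN, FLASH_LEN = 0x08000000, 128 * 1024   # LENGTH changed from 64K
SRAM_ORIGIN, SRAM_LEN   = 0x20000000,  20 * 1024

print(hex(FLASH_ORIGIN + FLASH_LEN))  # end of flash
print(hex(SRAM_ORIGIN + SRAM_LEN))    # end of SRAM
print(78848 > 64 * 1024)              # a 78848-byte image overflows 64 KiB
```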
  • Configure openocd to flash and debug through the J-Link OB:

    ~$ cat ~/jlink-ob-stm32f1-swd.cfg
    source [find interface/jlink.cfg]
    transport select swd
    source [find target/stm32f1x.cfg]

    # Flash
    ~$ openocd -f ~/jlink-ob-stm32f1-swd.cfg -c init -c "reset halt" -c "flash write_image erase nuttx.bin 0x08000000" -c "reset run"
    Open On-Chip Debugger 0.10.0+dev-01423-g3ffa14b04-dirty (2020-10-14-08:59)
    Licensed under GNU GPL v2
    For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
    Info : J-Link ARM-OB STM32 compiled Aug 22 2012 19:52:04
    Info : Hardware version: 7.00
    Info : VTarget = 3.300 V
    Info : clock speed 1000 kHz
    Info : SWD DPIDR 0x1ba01477
    Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints
    Info : starting gdb server for stm32f1x.cpu on 3333
    Info : Listening on port 3333 for gdb connections
    target halted due to debug-request, current mode: Thread
    xPSR: 0x01000000 pc: 0x08000130 msp: 0x20000f54
    Info : device id = 0x20036410
    Info : flash size = 128kbytes
    auto erase enabled
    wrote 78848 bytes from file nuttx.bin in 7.988087s (9.639 KiB/s)

  • openocd can also flash and debug through the ST-LINK of another board; official evaluation boards such as the NUCLEO series usually carry one. Here the ST-LINK of a NUCLEO-L152RE is used: first remove the CN2 jumpers, then, per the official document, the (SWD) CN4 pinout is 1:VDD, 2:SWCLK, 3:GND, 4:SWDIO, 5:NRST, 6:SWO, and only three wires are needed:

      SWD         STM32F103C8T6
PIN2 SWCLK ----> P14 SWCLK
PIN3 GND ----> GND
PIN4 SWDIO ----> P13 SWDIO

~$ openocd -f interface/stlink.cfg -f target/stm32f1x.cfg -c init -c "reset halt" -c "flash write_image erase nuttx.bin 0x08000000" -c "reset run"
Open On-Chip Debugger 0.10.0+dev-01423-g3ffa14b04-dirty (2020-10-14-08:59)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "hla_swd". To override use 'transport select <transport>'.
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
Info : clock speed 1000 kHz
Info : STLINK V2J22M5 (API v2) VID:PID 0483:374B
Info : Target voltage: 3.262028
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : starting gdb server for stm32f1x.cpu on 3333
Info : Listening on port 3333 for gdb connections
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08000130 msp: 0x20000d40
Info : device id = 0x20036410
Info : flash size = 128kbytes
auto erase enabled
wrote 109568 bytes from file nuttx.bin in 5.901268s (18.132 KiB/s)
  • Since the STM32F103C8T6 has only 20K of RAM, it is easy to run out of memory. The classic symptom is that a built-in app fails to run, as shown below:
NuttShell (NSH) NuttX-9.1.0
nsh> ?
help usage: help [-v] [<cmd>]

. cmp exit kill mount rmdir true
[ dirname export ls mv rmmod uname
? date false lsmod mw set umount
basename dd free mb printf sleep unset
cat df help mkdir ps source usleep
cd echo hexdump mksmartfs pwd test xd
cp exec insmod mh rm time

Builtin Apps:
chat sh hello spi nsh cu
nsh> free
total used free largest
Umem: 17072 15272 1800 1736

nsh> spi
nsh: spi: command not found
  • The cause: there is not enough RAM to copy the program from flash into memory for execution, which produces the error above. If the debug error options are enabled at build time, the nsh shell reports more detailed error information; see the reference.
  • My fix was to shrink the various thread stack sizes (STACKSIZE) to at most 1024. As shown above, only 1800 bytes are free, while the spi tool's STACKSIZE was set to 2048 at build time.
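The stack trimming described above is done in the board defconfig (or via menuconfig). A sketch of the kind of options involved, assuming NuttX 9.x option names; the exact names and values depend on the NuttX version and on which apps are enabled (each app has its own *_STACKSIZE option):

```
# Illustrative values only -- tune per board and per app.
CONFIG_IDLETHREAD_STACKSIZE=1024
CONFIG_USERMAIN_STACKSIZE=1024
CONFIG_DEFAULT_TASK_STACKSIZE=1024
CONFIG_PTHREAD_STACK_DEFAULT=1024
```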

Connecting the ESP8266

  • USART3 is used here, with the following option set:
    -> System Type -> U[S]ART Configuration -> Serial Driver Configuration -> [*] Disable reordering of ttySx devices.
    The ESP8266's serial device path is /dev/ttyS1. The wiring is shown below.
  STM32            ESP8266
PB10 TX3 ------> RX
PB11 RX3 ------> TX
3V3 ------> CH_PD # This should really be driven by a GPIO.
3V3 ------> 3V3
GND ------> GND

nsh> ?
help usage: help [-v] [<cmd>]

. cmp exit kill mount rmdir true
[ dirname export ls mv rmmod uname
? date false lsmod mw set umount
basename dd free mb printf sleep unset
cat df help mkdir ps source usleep
cd echo hexdump mksmartfs pwd test xd
cp exec insmod mh rm time

Builtin Apps:
chat sh spi nsh cu
nsh> cu -s 115200 -l /dev/ttyS1
AT

OK
AT+GMR
AT version:1.7.4.0(May 11 2020 19:13:04)
SDK version:3.0.4(9532ceb)
compile time:May 27 2020 10:12:17
Bin version(Wroom 02):1.7.4
OK

Reading and writing an SD card (over SPI)

  • Only SPI1 can be used as the SD interface, because it is hard-coded in boards/arm/stm32/stm32f103-minimum/src/stm32_mmcsd.c as follows:

    /*****************************************************************************
    * Pre-processor Definitions
    ****************************************************************************/

    #ifndef CONFIG_STM32_SPI1
    # error "SD driver requires CONFIG_STM32_SPI1 to be enabled"
    #endif

    #ifdef CONFIG_DISABLE_MOUNTPOINT
    # error "SD driver requires CONFIG_DISABLE_MOUNTPOINT to be disabled"
    #endif

    /*****************************************************************************
    * Private Definitions
    ****************************************************************************/

    static const int SD_SPI_PORT = 1; /* SD is connected to SPI1 port */
    static const int SD_SLOT_NO = 0; /* There is only one SD slot */

  • The basic required configuration options are:

    CONFIG_STM32_SPI=y
    CONFIG_STM32_SPI1=y
    CONFIG_SPI=y
    CONFIG_SPI_EXCHANGE=y
    CONFIG_SPI_DRIVER=y
    CONFIG_MMCSD_SPI=y
    CONFIG_MMCSD_SPICLOCK=20000000
    CONFIG_MMCSD_SPIMODE=0

    CONFIG_MMCSD=y
    CONFIG_MMCSD_NSLOTS=1
    CONFIG_MMCSD_SPI=y
    CONFIG_MMCSD_SPICLOCK=20000000
    CONFIG_MMCSD_SPIMODE=0
    CONFIG_MMCSD_IDMODE_CLOCK=400000
    CONFIG_NSH_MMCSDMINOR=0
    CONFIG_NSH_MMCSDSLOTNO=0
    CONFIG_NSH_MMCSDSPIPORTNO=1

    CONFIG_FS_FAT=y
    CONFIG_FAT_LCNAMES=y
    CONFIG_FAT_LFN=y
    CONFIG_FAT_MAXFNAME=32
    CONFIG_FAT_LFN_ALIAS_TRAILCHARS=0
    CONFIG_FSUTILS_MKFATFS=y

Reading and writing a W25Q32FV

  • Inspection of the code shows that the W25Q32FV driver is likewise hard-coded to SPI1 by default. These are the usual required options when using it on SPI1:

    CONFIG_STM32_SPI1=y
    CONFIG_STM32_SPI=y

    CONFIG_MTD_W25=y
    CONFIG_W25_SPIMODE=0
    CONFIG_W25_SPIFREQUENCY=20000000

    CONFIG_MTD=y
    CONFIG_MTD_PARTITION=y
    CONFIG_MTD_BYTE_WRITE=y
    CONFIG_MTD_SMART=y
    CONFIG_MTD_SMART_SECTOR_SIZE=1024
    CONFIG_MTD_SMART_WEAR_LEVEL=y
    CONFIG_MTD_W25=y
    CONFIG_FSUTILS_MKSMARTFS=y
  • With the configuration above, wire the W25Q32FV as follows:

    W25Q32FV       STM32F103

    CS ----> PA4/NSS
    DO ----> PA6/MISO
    DI ----> PA7/MOSI
    CLK ----> PA5/SCK
    GND ----> GND
    VCC ----> 3V3
  • Format, mount, and run a read/write test:

    nsh> mkdir /tmp
    nsh> ls /
    /:
    dev/
    mnt/
    proc/
    nsh> mksmartfs /dev/smart0p1
    nsh> mount -t smartfs /dev/smart0p1 /tmp
    nsh> echo "11223456" > /tmp/file1.txt
    nsh> cat /tmp/file1.txt
    11223456
  • Most pins on the STM32F103C8T6 board are labeled with their functions; a few, such as PB12, PB13, ..., require consulting Chapter 3, Table 5 of stm32f103c8.pdf.

Extending to SPI2

  • The STM32F103C8T6 minimal board exposes two SPI ports:

    • SPI1: PA4(NSS), PA5(SCK), PA6(MISO), PA7(MOSI)
    • SPI2: PB12(NSS), PB13(SCK), PB14(MISO), PB15(MOSI)
  • But NuttX only enables SPI1 for this board, and it can only drive a single peripheral. So the files below are modified so that SPI1 serves the SD card (over SPI) and SPI2 serves the W25Q32FV.

  • Following the FLASH_SPI1_CS example, add a FLASH_SPI2_CS macro on PB12; the resulting file:

    ~$ cat boards/arm/stm32/stm32f103-minimum/src/stm32f103_minimum.h
    [...]
    /* SPI chip selects */

    #define FLASH_SPI2_CS (GPIO_OUTPUT|GPIO_CNF_OUTPP|GPIO_MODE_50MHz|\
    GPIO_OUTPUT_SET|GPIO_PORTB|GPIO_PIN12)
    [...]
  • Modify stm32_spi.c. This file needs quite a few changes, so they are presented as a patch:

    ~$ git diff  boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c  > spi2.patch
    diff --git a/boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c b/boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c
    index 6f3a585902..01b5b861e8 100644
    --- a/boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c
    +++ b/boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c
    @@ -75,7 +75,7 @@ void stm32_spidev_initialize(void)
    */

    #ifdef CONFIG_MTD_W25
    - stm32_configgpio(FLASH_SPI1_CS); /* FLASH chip select */
    + stm32_configgpio(FLASH_SPI2_CS); /* FLASH chip select */
    #endif

    #ifdef CONFIG_CAN_MCP2515
    @@ -197,7 +197,7 @@ void stm32_spi1select(FAR struct spi_dev_s *dev, uint32_t devid,
    #endif

    #ifdef CONFIG_MTD_W25
    - stm32_gpiowrite(FLASH_SPI1_CS, !selected);
    + //stm32_gpiowrite(FLASH_SPI1_CS, !selected);
    #endif
    }

    @@ -227,6 +227,9 @@ uint8_t stm32_spi1status(FAR struct spi_dev_s *dev, uint32_t devid)
    void stm32_spi2select(FAR struct spi_dev_s *dev, uint32_t devid,
    bool selected)
    {
    +#ifdef CONFIG_MTD_W25
    + stm32_gpiowrite(FLASH_SPI2_CS, !selected);
    +#endif
    }

    uint8_t stm32_spi2status(FAR struct spi_dev_s *dev, uint32_t devid)
    @@ -294,6 +297,16 @@ int stm32_spi1cmddata(FAR struct spi_dev_s *dev, uint32_t devid,
    return -ENODEV;
    }
    #endif
    +
    +#ifdef CONFIG_STM32_SPI2
    +int stm32_spi2cmddata(FAR struct spi_dev_s *dev, uint32_t devid,
    + bool cmd)
    +{
    + return -ENODEV;
    +}
    +#endif
    +
    +
    #endif

    #endif /* CONFIG_STM32_SPI1 || CONFIG_STM32_SPI2 */

  • Modify stm32_w25.c as follows:

    git diff  boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c
    diff --git a/boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c b/boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c
    index 6e9d12718d..63ba5153ce 100644
    --- a/boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c
    +++ b/boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c
    @@ -47,7 +47,7 @@
    #include <errno.h>
    #include <debug.h>

    -#ifdef CONFIG_STM32_SPI1
    +#ifdef CONFIG_STM32_SPI2
    # include <nuttx/spi/spi.h>
    # include <nuttx/mtd/mtd.h>
    # include <nuttx/fs/smart.h>
    @@ -67,13 +67,13 @@
    * timer
    */

    -#define W25_SPI_PORT 1
    +#define W25_SPI_PORT 2

    /* Configuration ************************************************************/
    /* Can't support the W25 device if it SPI1 or W25 support is not enabled */

    #define HAVE_W25 1
    -#if !defined(CONFIG_STM32_SPI1) || !defined(CONFIG_MTD_W25)
    +#if !defined(CONFIG_STM32_SPI2) || !defined(CONFIG_MTD_W25)
    # undef HAVE_W25
    #endif

  • Modify stm32_bringup.c as follows:

    ~$ git diff boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c
    diff --git a/boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c b/boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c
    index efa651034e..b4b0379bde 100644
    --- a/boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c
    +++ b/boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c
    @@ -143,7 +143,7 @@

    /* Can't support the W25 device if it SPI1 or W25 support is not enabled */

    -#if !defined(CONFIG_STM32_SPI1) || !defined(CONFIG_MTD_W25)
    +#if !defined(CONFIG_STM32_SPI2) || !defined(CONFIG_MTD_W25)
    # undef HAVE_W25
    #endif

Enabling serial debug output

  • If the corresponding DEBUG options are not enabled in the configuration system, the system only reports a bare error number on failure. The options below enable error reporting for SPI, FS, and MMC/SD:
    CONFIG_DEBUG_ALERT=y
    CONFIG_DEBUG_FEATURES=y
    CONFIG_DEBUG_ERROR=y
    CONFIG_DEBUG_FS=y
    CONFIG_DEBUG_FS_ERROR=y
    CONFIG_DEBUG_IRQ=y
    CONFIG_DEBUG_IRQ_ERROR=y
    CONFIG_DEBUG_MEMCARD=y
    CONFIG_DEBUG_MEMCARD_ERROR=y
    CONFIG_DEBUG_SPI=y
    CONFIG_DEBUG_SPI_ERROR=y
    CONFIG_DEBUG_FULLOPT=y
    CONFIG_ARCH_HAVE_HARDFAULT_DEBUG=y
    CONFIG_ARCH_HAVE_MEMFAULT_DEBUG=y
    CONFIG_STM32_DISABLE_IDLE_SLEEP_DURING_DEBUG=y
nsh> mount -t vfat /dev/mmcsd1 /mnt/sd1
nsh: mount: mount failed: 19
  • When such an error number appears, open nuttx/include/errno.h and look it up:

    #define ENODEV              19
    #define ENODEV_STR "No such device"
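On a Linux host the same lookup can be done without opening the header, since NuttX reuses the standard POSIX numbering for common codes like 19/ENODEV (this assumes python3 is installed on the host):

```shell
# Map a bare errno number to its symbolic name and message on the host.
python3 -c "import errno, os; print(errno.errorcode[19], '-', os.strerror(19))"
# ENODEV - No such device
```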
  • The final configuration enabling both the SPI1 and SPI2 interfaces, including the required SPI, FS, MMC/SD and MTD options:


    CONFIG_STM32_SPI1=y
    CONFIG_STM32_SPI2=y
    CONFIG_STM32_SPI=y

    CONFIG_ARCH_HAVE_SPI_BITORDER=y
    CONFIG_SPI=y
    CONFIG_SPI_EXCHANGE=y
    CONFIG_SPI_CMDDATA=y
    CONFIG_SPI_DRIVER=y
    CONFIG_MMCSD=y
    CONFIG_MMCSD_NSLOTS=1
    CONFIG_MMCSD_SPI=y
    CONFIG_MMCSD_SPICLOCK=20000000
    CONFIG_MMCSD_SPIMODE=0
    CONFIG_MMCSD_IDMODE_CLOCK=400000

    CONFIG_MTD=y
    CONFIG_MTD_PARTITION=y
    CONFIG_MTD_BYTE_WRITE=y
    CONFIG_MTD_SMART=y
    CONFIG_MTD_SMART_SECTOR_SIZE=1024
    CONFIG_MTD_SMART_WEAR_LEVEL=y
    CONFIG_MTD_W25=y
    CONFIG_W25_SPIMODE=0
    CONFIG_W25_SPIFREQUENCY=20000000

    CONFIG_FS_NEPOLL_DESCRIPTORS=8
    CONFIG_FS_MQUEUE_MPATH="/var/mqueue"
    CONFIG_FS_FAT=y
    CONFIG_FS_SMARTFS=y
    CONFIG_SMARTFS_ERASEDSTATE=0xff
    CONFIG_SMARTFS_MAXNAMLEN=16
    CONFIG_FS_PROCFS=y
    CONFIG_FS_PROCFS_REGISTER=y
    CONFIG_FS_PROCFS_EXCLUDE_ENVIRON=y

    CONFIG_FSUTILS_MKFATFS=y
    CONFIG_FSUTILS_MKSMARTFS=y
    CONFIG_NSH_MMCSDMINOR=0
    CONFIG_NSH_MMCSDSLOTNO=0
    CONFIG_NSH_MMCSDSPIPORTNO=1
    CONFIG_NSH_CODECS_BUFSIZE=128

  • Testing the system:

    NuttShell (NSH) NuttX-9.1.0
    nsh> ls /dev
    /dev:
    console
    mmcsd0
    null
    smart0p0
    smart0p1
    smart0p2
    smart0p3
    ttyS0
    nsh> free
    total used free largest
    Umem: 17536 12872 4664 4664
    nsh> mkdir /mnt /p0
    nsh: mkdir: too many arguments
    nsh> mkdir /mnt
    nsh> mkdir /p0
    nsh> mount -t vfat /dev/mmcsd0 /mnt
    nsh> mount -t smartfs /dev/smart0p0 /p0
    nsh> df
    Block Number
    Size Blocks Used Available Mounted on
    16384 15611 3 15608 /mnt
    1024 64 11 53 /p0
    0 0 0 0 /proc
    nsh> free
    total used free largest
    Umem: 17536 14888 2648 2584
    nsh> ls /p0
    /p0:
    file.txt
    nsh> cat /p0/file.txt
    test
    nsh> free
    total used free largest
    Umem: 17536 14888 2648 2584
    nsh> ls /mnt
    /mnt:
    file1.txt
    nsh> cat /mnt/file1.txt
    1112222ssss

NUCLEO-L152RE (MB1136 c-03)

Submitting code to NuttX

  • GitHub documentation (Chinese)
  • Making Changes Using Git
  • The NuttX code lives on GitHub, so submitting code to it requires a GitHub account first.
  • Fork NuttX under your own GitHub account, then clone the fork locally.
  • Set the original https://github.com/apache/incubator-nuttx.git as the upstream remote:
~$ git clone <your forked incubator-nuttx project clone url>
~$ git remote add upstream https://github.com/apache/incubator-nuttx.git
  • Create a local development branch and push it to your own fork:
    ~$ git checkout -b dev/stm32l152re-ili93418b-driver
    Switched to a new branch 'dev/stm32l152re-ili93418b-driver'

    ~$ git push
    fatal: The current branch dev/stm32l152re-ili93418b-driver has no upstream branch.
    To push the current branch and set the remote as upstream, use

    git push --set-upstream origin dev/stm32l152re-ili93418b-driver
  • As the hint says, the local branch dev/stm32l152re-ili93418b-driver must be associated with a remote branch and pushed to origin:
    ~$ git push --set-upstream origin dev/stm32l152re-ili93418b-driver
    Username for 'https://github.com': xxxxxx
    Password for 'https://xxxxxxx@github.com':
    Total 0 (delta 0), reused 0 (delta 0)
    remote:
    remote: Create a pull request for 'dev/stm32l152re-ili93418b-driver' on GitHub by visiting:
    remote: https://github.com/xxxxxx/incubator-nuttx/pull/new/dev/stm32l152re-ili93418b-driver
    remote:
    To https://github.com/xxxxx/incubator-nuttx
    * [new branch] dev/stm32l152re-ili93418b-driver -> dev/stm32l152re-ili93418b-driver
    Branch 'dev/stm32l152re-ili93418b-driver' set up to track remote branch 'dev/stm32l152re-ili93418b-driver' from 'origin'.
  • Once associated with the remote branch, a plain push suffices from then on. .git/config now looks like this:
    ~$ cat .git/config
    [core]
    repositoryformatversion = 0
    filemode = true
    bare = false
    logallrefupdates = true
    [remote "origin"]
    url = https://github.com/xxxxx/incubator-nuttx
    fetch = +refs/heads/*:refs/remotes/origin/*
    [branch "master"]
    remote = origin
    merge = refs/heads/master
    [remote "upstream"]
    url = https://github.com/apache/incubator-nuttx.git
    fetch = +refs/heads/*:refs/remotes/upstream/*
    [branch "dev/stm32l152re-ili93418b-driver"]
    remote = origin
    merge = refs/heads/dev/stm32l152re-ili93418b-driver
  • Fetch upstream's updates, merge them into the local master branch, then push to your own origin repository:
    ~$ git checkout master       # switch to the local master branch
    ~$ git fetch upstream        # fetch and sync the upstream changes
    ~$ git merge upstream/master # merge upstream into the local branch
    ~$ git push                  # push the local master to origin/master
  • Create new changes or files and push them to the remote branch:
    ~$ git add new-file.c
    ~$ git commit new-file.c
    ~$ git push
  • At this point the branch page on GitHub shows a Create Pull Request prompt; use it to submit a merge request of the current branch against upstream.
  • If, after running rebase upstream/master, the code turns out to be broken, git reset --hard HEAD~1 rolls back to a given commit (~[num] selects how many commits to step back); the gitg GUI shows this graphically. For example, to roll back to just before the branch's first commit, re-edit or re-add the files there, commit again, and then a git push --force is required. After that, git fetch upstream; git rebase upstream/master can be run again.

Squashing multiple commits into one

  • To keep the history concise and readable, squash several commits into one complete commit before pushing to the upstream repository. The command form is git rebase -i [startpoint] [endpoint]; if endpoint is omitted, the range ends at the commit the branch HEAD currently points to.
    ~$ git rebase -i HEAD~3
    pick a68185edc0f2cbd38c8fdbcffaf516278f4f Fix merge conflicts
    pick efe10a20278f53af9e6fff5754de39b8c8c4 net/icmp: add sanity check to avoid wild data length
    pick fb7480c67637bfa2164f4f76ceff6f509d24 net/neighbor/neighbor_ethernet_out.c: fix build error without ICMPv6
    pick 0a262336bd9964b693b57fe93d992482d5d3 arch/arm/src/stm32/stm32_otghsdev.c: Fix syslog formats
    [....]
    # Rebase dd4b5e0c68..18d489a8dd onto dd4b5e0c68 (32 commands)
    #
    # Commands:
    # p, pick <commit> = use commit /* keep this commit */
    # r, reword <commit> = use commit, but edit the commit message /* keep it, but edit its message */
    # e, edit <commit> = use commit, but stop for amending /* keep it, but stop to amend the commit itself, not just the message */
    # s, squash <commit> = use commit, but meld into previous commit /* merge this commit into the previous one */
    # f, fixup <commit> = like "squash", but discard this commit's log message /* like squash, but drop this commit's message */
    # x, exec <command> = run command (the rest of the line) using shell /* run a shell command */
    # b, break = stop here (continue rebase later with 'git rebase --continue')
    # d, drop <commit> = remove commit /* discard this commit */
    # l, label <label> = label current HEAD with a name
    # t, reset <label> = reset HEAD to a label
    # m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
    # . create a merge commit using the original merge commit's
    # . message (or the oneline, if no original merge commit was
    # . specified). Use -c <commit> to reword the commit message.
    #
    # These lines can be re-ordered; they are executed from top to bottom.
    #
    # If you remove a line here THAT COMMIT WILL BE LOST.
    #
    # However, if you remove everything, the rebase will be aborted.
    #
    # Note that empty commits are commented out
  • As the help text explains, several edit actions are supported. Suppose the list is changed as shown below. Saving switches to the commit-message editor; save again and the rebase runs. Then force-push the result with git push -f.
    pick c029a68185edc0f2cbd38c8fdbcffaf516278f4f Fix merge conflicts
    s efe10a20278f53af9e6fff5754de39b8c8c4 net/icmp: add sanity check to avoid wild data length
    s fb7480c67637bfa2164f4f76ceff6f509d24 net/neighbor/neighbor_ethernet_out.c: fix build error without ICMPv6
    f 0a262336bd9964b693b57fe93d992482d5d3 arch/arm/src/stm32/stm32_otghsdev.c: Fix syslog formats
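The same pick/squash edit can be reproduced non-interactively in a throwaway repo; this sketch uses GIT_SEQUENCE_EDITOR to rewrite the todo list and GIT_EDITOR=true to accept the combined commit message unchanged (GNU sed assumed for `sed -i`):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git config user.email you@example.com
git config user.name you
git commit -q --allow-empty -m base
# three small commits to squash
for i in 1 2 3; do
    echo "$i" >> f.txt
    git add f.txt
    git commit -q -m "step $i"
done
# turn lines 2-3 of the todo list from "pick" into "squash"
GIT_SEQUENCE_EDITOR='sed -i "2,3s/^pick/squash/"' GIT_EDITOR=true \
    git rebase -i HEAD~3
git log --oneline    # one squashed commit on top of "base"
```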
  • Under the hood this is just a file inside .git:
    ~$ head  .git/rebase-merge/git-rebase-todo.backup
    pick a68185edc0f2cbd38c8fdbcffaf516278f4f Fix merge conflicts
    pick efe10a20278f53af9e6fff5754de39b8c8c4 net/icmp: add sanity check to avoid wild data length
    pick fb7480c67637bfa2164f4f76ceff6f509d24 net/neighbor/neighbor_ethernet_out.c: fix build error without ICMPv6
    pick 0a262336bd9964b693b57fe93d992482d5d3 arch/arm/src/stm32/stm32_otghsdev.c: Fix syslog formats
    [....]
  • Once the branch has been merged into the mainline, it can be deleted:
    ~$ git branch -d <local-branch>
    ~$ git push origin --delete <remote-branch>
  • If a rebase produces CONFLICT (content): Merge conflict in src/xxxx.cpp, and it is certain that one side should simply win, the commands below resolve it automatically: --theirs takes the pulled upstream version, --ours takes the local version. Commit the result and the merge is done.
    ~$ git checkout --theirs src/xxxx.cpp
    ~$ git commit
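The merge case can be sketched in a throwaway repo: two branches edit the same line, the merge conflicts, and `git checkout --theirs` takes the incoming branch's version wholesale. (Note that during a rebase the roles of --ours/--theirs are inverted, because the branch being rebased is replayed as "theirs".)

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git config user.email you@example.com
git config user.name you
echo base > file.txt; git add file.txt; git commit -q -m base
git checkout -q -b feature
echo theirs-version > file.txt; git commit -q -am feature
git checkout -q -                    # back to the starting branch
echo ours-version > file.txt; git commit -q -am local
git merge feature || true            # CONFLICT (content) in file.txt
git checkout --theirs file.txt       # keep the feature branch's version
git add file.txt
git commit -q -m "merge feature, theirs wins"
cat file.txt                         # theirs-version
```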

Other Git usage

  • Export the most recent commit as a patch file:
~$ git format-patch -1 HEAD
  • When automatically resolving merge conflicts in the case where only one side has modifications, see the reference here; the following can be used:
~$ git merge --strategy-option theirs <branch>

Auto-incrementing a version number file on git commit

~$ cat .git/hooks/pre-commit
#!/bin/bash

export VER_FILE=src/utils/utils.pri

if [ ! -e ${VER_FILE} ]; then
exit 1;
fi

echo "start to update the version in $VER_FILE"
# 211.8.73
export VERSION=$(grep "^VERSION =" ${VER_FILE} | awk '{print $3}' | tr -d '\r')
# VARR(211 8 73)
IFS='.' read -r -a VARR <<< "$VERSION"
# VARR(211 8 74)
VARR[2]=$((VARR[2]+1))
# 211. 8 .74
VERSION=$(echo ${VARR[@]} | fold -w3 | paste -sd.)
# replace 211.8.73 with 211.8.74
sed -i "/^VERSION/s/=.*/= ${VERSION// /}/" ${VER_FILE}
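The hook's increment logic can be exercised in isolation. This POSIX-sh sketch replaces the fold/paste join with a plain variable join, which also behaves correctly when the version components are not all three digits wide (the hook's `fold -w3` trick assumes the widths in its own file):

```shell
# Split MAJ.MIN.PATCH, increment PATCH, re-join.
VERSION=211.8.73
IFS=. read -r MAJ MIN PATCH <<EOF
$VERSION
EOF
PATCH=$((PATCH+1))
VERSION="$MAJ.$MIN.$PATCH"
echo "$VERSION"    # 211.8.74
```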

Using a GUI tool (meld) for git diffs

  • Add the following to .git/config (or to your global .gitconfig):

    # Add the following to your .gitconfig file.
    [diff]
    tool = meld
    [difftool]
    prompt = false
    [difftool "meld"]
    cmd = meld "$LOCAL" "$REMOTE"
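Equivalently, the same keys can be set with git config rather than editing the file by hand; a sketch against a throwaway repo's local config:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git config diff.tool meld
git config difftool.prompt false
git config difftool.meld.cmd 'meld "$LOCAL" "$REMOTE"'
git config diff.tool    # meld
```

Use `git config --global` instead to apply the settings to all repositories.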
  • Compare the same file across two branches:

    ~$ git difftool mybranch master -- target.file
  • Compare the src directory between the current branch and other-branch; the argument after -- may be omitted, or may name a specific file or directory:

    ~$ git difftool other-branch -- src
  • Compare a file on the current branch against another commit:

    ~$ git difftool HEAD~2 -- src/file.txt
  • Merge other-branch into the current branch, resolving conflicts in favor of other-branch (theirs):

    ~$ git merge -X theirs other-branch
  • Roll back a merge:

    ~$ git reset --merge HEAD~1
  • Force-fetch the remote repository and overwrite the local one, instead of dealing with conflict prompts on pull (use with care):

    # fetch from the default remote, origin
    ~$ git fetch
    # reset your current branch (master) to origin's master
    ~$ git reset --hard origin/master
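The force-sync can be simulated with two throwaway repos: the clone makes a local commit, origin moves ahead differently, and fetch + reset --hard discards the local work in favor of origin's branch.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" config user.email you@example.com
git -C "$work/origin" config user.name you
echo v1 > "$work/origin/a.txt"
git -C "$work/origin" add a.txt
git -C "$work/origin" commit -q -m v1
git clone -q "$work/origin" "$work/clone"
git -C "$work/clone" config user.email you@example.com
git -C "$work/clone" config user.name you
# diverge on both sides
echo local-change > "$work/clone/a.txt"
git -C "$work/clone" commit -q -am local
echo v2 > "$work/origin/a.txt"
git -C "$work/origin" commit -q -am v2
# force-sync the clone to origin's state
branch=$(git -C "$work/clone" rev-parse --abbrev-ref HEAD)
git -C "$work/clone" fetch -q
git -C "$work/clone" reset --hard -q "origin/$branch"
cat "$work/clone/a.txt"    # v2
```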

STM32F4-Discovery(MB997C)

Audio (CS43L22) support

SPI SD card support

USB-OTG

BLE Sniffer

Ethernet 8720A

Arm Mbed-OS

Setting up the development environment

  • The latest version at the time of writing, v6.3, is used here. Mbed supports three kinds of development environment (desktop IDE, online IDE, and command line) across multiple platforms (Windows, Mac, Linux), something Keil and IAR cannot match. Following the official documentation, the desktop and command-line workflows are tried out below. Development is in C++, unlike the traditional C of Keil/IAR, with a code style similar to Arduino's, and it ships with a built-in RTOS (Keil RTX via CMSIS-RTOS2).

Mbed Studio

  • Using Mbed Studio requires registering an account; on first launch after installation the IDE asks you to log in. Like the Web Studio, it can import the official template projects from online.

    ~$ wget -c https://studio.mbed.com/installers/latest/linux/MbedStudio.sh
    ~$ ./MbedStudio.sh
    ~$ du -sh ~/.local/bin/mbed-studio
    ~$ ls ~/.config/"Mbed Studio"
    api-targets.json Cache Cookies GPUCache library-pipeline mbed-studio.log 'Network Persistent State'
    blob_storage config.json Cookies-journal library-cache 'Local Storage' mbed-studio-tools recentworkspace.json
  • Mbed Studio ships with Arm Compiler 6 by default, but it can be switched to the Arm Embedded GCC compiler:

~$ cat > ~/.config/"Mbed Studio"/external-tools.json <<EOF
> {
> "bundled": {
> "gcc": "/fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin"
> },
> "defaultToolchain": "GCC_ARM"
> }
> EOF
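It is worth sanity-checking that the JSON written above actually parses (python3 assumed); a stray comma or unquoted path would likely make Mbed Studio ignore the file. The sketch below writes a copy to /tmp so it is side-effect free:

```shell
cat > /tmp/external-tools.json <<'EOF'
{
  "bundled": {
    "gcc": "/fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin"
  },
  "defaultToolchain": "GCC_ARM"
}
EOF
# json.tool exits non-zero on malformed JSON
python3 -m json.tool /tmp/external-tools.json > /dev/null && echo valid
```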
  • As shown above, Mbed Studio is a VS Code-style IDE that uses JSON configuration files;
    open them via the menu: File -> Settings -> Open Preferences.

Testing the FRDM-KL25Z

Updating the OpenSDA firmware

  • Mbed Studio needs recent firmware to debug the FRDM-KL25Z: at least mbed_if_v2.0_frdm_kl25z.s19 is required for CMSIS-DAP support, so download the firmware from the link above (Pemicro_OpenSDA_Debug_MSD_Update_Apps_2020_05_12.zip). After unpacking, the folder contains *.SDA firmware files plus some notes and guides. When the computer is connected to the KL25Z's SDA port, a FRDM-KL25Z drive appears in the system.
  • One catch: to upgrade the firmware, the board must be put into bootloader mode, in which it mounts a drive named BOOTLOADER. It turned out the upgrade only works under Windows: hold the board's RST button while plugging SDA in. Because the board's firmware was v1.01, no USB drive would mount under Linux; only under Windows XP / Win 7 did it mount, and the upgrade succeeded there. The root cause was not investigated further.
  • The unpacked firmware directory also contains OpenSDA_Bootloader_Update_App_v111_2013_12_11.zip, which unpacks to the BOOTUPDATEAPP_Pemicro_v111.SDA firmware file. Copy the three files MSD-DEBUG-FRDM-KL25Z_Pemicro_v118.SDA, BOOTUPDATEAPP_Pemicro_v111.SDA and 20140530_k20dx128_kl25z_if_opensda.s19 directly into the BOOTLOADER drive to upgrade. Once done, plugging in SDA auto-mounts an MBED drive under Linux.
  • Hold the board's RST button again while plugging SDA in to enter BOOTLOADER mode, open SDA_INFO.HTM on the BOOTLOADER drive to jump to the web page, and check that the page's information matches the firmware versions just installed. From firmware v1.11 onward the BOOTLOADER drive also mounts automatically under Linux, which makes subsequent upgrades convenient.
  • Create a new project by importing the official mbed-os-example-blinky example; the IDE looks like this:
    mbed-studio-blinky-project.png
  • After the update, openocd can connect for debugging:
    ~$ openocd -c "adapter driver cmsis-dap" -f board/frdm-kl25z.cfg
    Open On-Chip Debugger 0.10.0+dev-01423-g3ffa14b04-dirty (2020-10-14-08:59)
    Licensed under GNU GPL v2
    For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
    Warn : Interface already configured, ignoring
    Info : auto-selecting first available session transport "swd". To override use 'transport select <transport>'.
    Info : add flash_bank kinetis kl25.pflash
    Info : Listening on port 6666 for tcl connections
    Info : Listening on port 4444 for telnet connections
    Info : CMSIS-DAP: SWD Supported
    Info : CMSIS-DAP: FW Version = 1.0
    Info : CMSIS-DAP: Interface Initialised (SWD)
    Info : SWCLK/TCK = 0 SWDIO/TMS = 1 TDI = 0 TDO = 0 nTRST = 0 nRESET = 1
    Info : CMSIS-DAP: Interface ready
    Info : clock speed 1000 kHz
    Info : SWD DPIDR 0x0bc11477
    Info : SWD DPIDR 0x0bc11477
    Error: Failed to write memory at 0xe000edf0
    Info : kl25.cpu: external reset detected
    Warn : **** Your Kinetis MCU is probably locked-up in RESET/WDOG loop. ****
    Warn : **** Common reason is a blank flash (at least a reset vector). ****
    Warn : **** Issue 'kinetis mdm halt' command or if SRST is connected ****
    Warn : **** and configured, use 'reset halt' ****
    Warn : **** If MCU cannot be halted, it is likely secured and running ****
    Warn : **** in RESET/WDOG loop. Issue 'kinetis mdm mass_erase' ****
    Info : starting gdb server for kl25.cpu on 3333
    Info : Listening on port 3333 for gdb connections

Mbed CLI

  • Installation requirements on Linux: the system needs Git, Python 3.7.x, and Mercurial installed first.
    ~$ sudo apt-get install python3 python3-pip git mercurial -y
    ~$ pip install mbed-cli

Installing and configuring the cross-toolchain

~$ mbed config -G ARM_GCC_PATH /fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin
[mbed] fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin now set as global ARM_GCC
~$ mbed config --list
[mbed] Global config:
GCC_ARM_PATH=/fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin
ARMC6_PATH=/fullpath/ARMCompiler6.15/bin

[mbed] Local config (/home/michael):
Couldn't find valid mbed program in /home/michael

Creating a new project

~$ mbed new mbed-example-program
[mbed] Working path "/fullpath/Mbed Programs" (directory)
[mbed] Creating new program "mbed-example-program" (git)
[mbed] Adding library "mbed-os" from "https://github.com/ARMmbed/mbed-os" at branch/tag "latest"
[mbed] Updating reference "mbed-os" -> "https://github.com/ARMmbed/mbed-os/#0db72d0cf26539016efbe38f80d6f2cb7a3d4414"
[mbed] Auto-installing missing Python modules (mbed_cloud_sdk, mbed_ls, mbed_host_tests, mbed_greentea, manifest_tool, icetea, pycryptodome, cryptography)...
~$ tree -L 1 mbed-example-program/
mbed-example-program/
├── mbed_app.json
├── mbed-os
├── mbed-os.lib
└── mbed_settings.py

1 directory, 3 files
~$ mbed ls -a
[mbed] Working path "/fullpath/Mbed Programs/mbed-example-program" (program)
mbed-example-program (mbed-example-program)
`- mbed-os (https://github.com/ARMmbed/mbed-os#0db72d0cf265)

  • The project created above includes mbed-os support by default; the --create-only option creates a project without the OS:
    ~$ mbed new project2 --create-only
    [mbed] Working path "/fullpath/Mbed Programs" (directory)
    [mbed] Creating new program "project2" (git)
    ~$ tree -L 1 project2/
    project2/
    └── mbed_settings.py

Importing a project

~$ mbed import https://github.com/ARMmbed/mbed-os-example-blinky#mbed-os-5.15.0  my-blink
[mbed] Working path "/fullpath/github/ArmMbed" (directory)
[mbed] Importing program "my-blink" from "https://github.com/ARMmbed/mbed-os-example-blinky" at branch/tag "mbed-os-5.15.0"
[mbed] Adding library "mbed-os" from "https://github.com/ARMmbed/mbed-os" at rev #64853b354fa1

Adding a library to a project

~$ cd my-blink
~$ mbed add https://github.com/ARMmbed/mbed-cloud-client
[mbed] Working path "/fullpath/Mbed Programs/my-blink" (program)
[mbed] Adding library "mbed-cloud-client" from "https://github.com/ARMmbed/mbed-cloud-client" at latest revision in the current branch
[mbed] Updating reference "mbed-cloud-client" -> "https://github.com/ARMmbed/mbed-cloud-client/#f72a23e0dc21de4c82ee53fe947153341419a5b9"

  • To remove a library, simply run: mbed remove mbed-cloud-client

Building the project

  • List the supported boards:

    ~$ mbed compile --supported  # mbed compile -S
    [mbed] Working path "/fullpath/ArmMbed/mbed-os-example-blinky" (program)
    | Target | mbed OS 2 | mbed OS 5 | ARM | uARM | GCC_ARM | IAR |
    | ------------- | --------- | --------- | --------- | ---- | --------- | --------- |
    | ADV_WISE_1510 | - | Supported | Supported | - | Supported | Supported |
    | ADV_WISE_1570 | - | Supported | Supported | - | Supported | Supported |
    | ARCH_MAX | - | Supported | Supported | - | Supported | Supported |
    | ARCH_PRO | - | Supported | Supported | - | Supported | Supported |
    [......]

    ~$
  • Build:

    ~$ mbed compile -m KL25Z -t GCC_ARM
    [mbed] Working path "/fullpath/ArmMbed/my-blink" (program)
    Building project my-blink (KL25Z, GCC_ARM)
    Scan: my-blink
    Compile [ 0.4%]: at24mac.cpp
    [...]
    Link: my-blink
    Elf2Bin: my-blink
    | Module | .text | .data | .bss |
    | ---------------- | ------------- | ----------- | ----------- |
    | [fill] | 48(+48) | 0(+0) | 28(+28) |
    | [lib]/c.a | 4828(+4828) | 2108(+2108) | 89(+89) |
    | [lib]/gcc.a | 1004(+1004) | 0(+0) | 0(+0) |
    | [lib]/misc | 200(+200) | 4(+4) | 28(+28) |
    | main.o | 84(+84) | 0(+0) | 0(+0) |
    | mbed-os/drivers | 92(+92) | 0(+0) | 0(+0) |
    | mbed-os/hal | 1440(+1440) | 4(+4) | 67(+67) |
    | mbed-os/platform | 4204(+4204) | 264(+264) | 220(+220) |
    | mbed-os/rtos | 6468(+6468) | 168(+168) | 5973(+5973) |
    | mbed-os/targets | 2424(+2424) | 4(+4) | 19(+19) |
    | Subtotals | 20792(+20792) | 2552(+2552) | 6424(+6424) |
    Total Static RAM memory (data + bss): 8976(+8976) bytes
    Total Flash memory (text + data): 23344(+23344) bytes

    Image: ./BUILD/KL25Z/GCC_ARM/my-blink.bin
  • Set the default target and cross-toolchain:

    ~$ mbed target KL25Z
    [mbed] Working path "/fullpath/Mbed Programs/my-blink" (program)
    [mbed] KL25Z now set as default target in program "my-blink"
    ~$ mbed toolchain GCC_ARM
    [mbed] Working path "/fullpath/Mbed Programs/my-blink" (program)
    [mbed] GCC_ARM now set as default toolchain in program "my-blink"

Testing and debugging

  • Run the code tests:

    ~$ mbed test -m KL25Z -t GCC_ARM
    [...]
  • List the available test cases:

    ~$ mbed test --compile-list  | head
    Test Case:
    Name: mbed-os-features-device_key-tests-device_key-functionality
    Path: ./mbed-os/features/device_key/TESTS/device_key/functionality
    Test Case:
    Name: mbed-os-features-frameworks-utest-tests-unit_tests-basic_test
    Path: ./mbed-os/features/frameworks/utest/TESTS/unit_tests/basic_test
    Test Case:
    [....]
  • Run the tests on a connected board:

    ~$ mbed test -m KL25Z -t GCC_ARM --run
    [mbed] Working path "/home/michael/3TB-DISK/Mbed Programs/my-blink" (program)
    mbedgt: greentea test automation tool ver. 1.7.4
    mbedgt: test specification file './BUILD/tests/KL25Z/GCC_ARM/test_spec.json' (specified with --test-spec option)
    mbedgt: using './BUILD/tests/KL25Z/GCC_ARM/test_spec.json' from current directory!
    mbedgt: detecting connected mbed-enabled devices...
    mbedgt: detected 1 device
    mbedgt: processing target 'KL25Z' toolchain 'GCC_ARM' compatible platforms... (note: switch set to --parallel 1)
    mbedgt: running 4 tests for platform 'KL25Z' and toolchain 'GCC_ARM'
    mbedgt: mbed-host-test-runner: started
    mbedgt: retry mbedhtrun 1/1
    mbedgt: ['mbedhtrun', '-m', 'KL25Z', '-p', '/dev/ttyACM0:9600', '-f', '"BUILD/tests/KL25Z/GCC_ARM/mbed-os/TESTS/psa/spm_smoke/spm_smoke.bin"', '-e', '"mbed-os/TESTS/host_tests"', '-d', '/media/michael/MBED', '-c', 'default', '-t', '02000201242BD1925E8A1EE0', '-r', 'default', '-C', '4', '--sync', '5', '-P', '60'] failed after 1 count
    [...]

Arduino

  • Links:
    • stm32duino
    • wiki
    • Both the Nucleo-F767ZI and the Nucleo-L152RE are Arduino pin compatible; the Nucleo-L152RE is used as the target here.

Adding STM32 Cores

Flashing method

  • Download stm32-programmers, unpack it, and run the Linux installer directly; follow the wizard and install under the current user's home directory. After installation the directory looks like this:

    ~/STMicroelectronics/STM32Cube/STM32CubeProgrammer/bin$ ls
    ExternalLoader HSM libssl.so libstp11_SAM.so.conf STM32CubeProgrammer STM32MP_KeyGen_CLI STM32_Programmer_CLI
    FlashLoader libcrypto.so libstp11_SAM.so RSSe STM32CubeProgrammerLauncher STM32MP_SigningTool_CLI STM32_Programmer.sh

  • With the Nucleo-L152RE as the target board, select Tools -> Board: <any> -> STM32 Boards (select from submenu) -> Nucleo-64.

  • For the upload method, select Tools -> Upload method -> STM32CubeProgrammer (SWD)

Updating the ST-Link firmware

  • JTAG and SWD Guide

  • Flashing fails with the following errors:

     STM32CubeProgrammer v2.4.0
    -------------------------------------------------------------------

    Error: Old ST-LINK firmware version. Upgrade ST-LINK firmware
    Error: Old ST-LINK firmware version. Upgrade ST-LINK firmware
    Error: Old ST-LINK firmware!Please upgrade it.
    Error: Old ST-LINK firmware!Please upgrade it.

  • Download stsw-link007 and unpack it.

  • The unpacked directory looks like this:

    ~$ tree -L 2  stsw-link007
    stsw-link007
    ├── AllPlatforms
    │   ├── native
    │   ├── StlinkRulesFilesForLinux
    │   └── STLinkUpgrade.jar
    ├── readme.txt
    └── Windows
    ├── ST-LinkUpgrade.exe
    └── STLinkUSBDriver.dll
  • As its readme.txt explains, once the StlinkRulesFilesForLinux udev rules are installed you can run the GUI updater to update the firmware:

    ~$ java -jar STLinkUpgrade.jar
  • After updating to the latest firmware, the board can be read and written with ST's official STM32CubeProgrammer; however, using its ST-Link to flash and debug an external STM32F103 minimal system board over SWD still failed.

~$ st-info --probe
Found 1 stlink programmers
serial: 30363641464634383535353037353531383731xxxxxx
hla-serial: "\x30\x36\x36\x41\x46\x46\x34\x38\x35\x35\x35\x30\x37\x35\x35\x31\x38\x37\x31\x38\x32\x37\x34\x37"
flash: 2097152 (pagesize: 2048)
sram: 524288
chipid: 0x0451
descr: F76xxx

  • stlink-gui
~$ apt-get install stlink-gui
  • After a successful connection, the memory contents can be read.

SPI SD example

  • Use the stock library example File -> Examples -> Examples for any board -> SD -> listfiles, shown below. Its pin definitions are compatible with the Nucleo-L152RE except for CS: the original sketch uses pin 4, while the Nucleo-L152RE uses pin 10 (CS), so the only change needed is !SD.begin(10).
The circuit:
SD card attached to SPI bus as follows:
** MOSI - pin 11
** MISO - pin 12
** CLK - pin 13
** CS - pin 4 (for MKRZero SD: SDCARD_SS_PIN)

[...]
if (!SD.begin(10)) {
Serial.println("initialization failed!");
while (1);
}
[...]
  • Click Upload; if the build and flash succeed, watch the sketch's output on the serial port.

Raspberry Pi related

FreeRTOS

Protocol analysis tools

FTDI232H

PyFTDI

I2C

  • I2C communication: SCL (AD0), SDA (AD1, AD2). An I2C address is a 7-bit value; combined with an 8th direction bit (0: write, 1: read) it forms an 8-bit value.

    from pyftdi.i2c import I2cController
    # Instantiate an I2C controller
    i2c = I2cController()

    # Configure the first interface (IF/1) of the FTDI device as an I2C master
    i2c.configure('ftdi://ftdi:2232h/1')

    # Get a port to an I2C slave device
    slave = i2c.get_port(0x21)

    # Send one byte, then receive one byte
    slave.exchange([0x04], 1)

    # Write a register to the I2C slave
    slave.write_to(0x06, b'\x00')

    # Read a register from the I2C slave
    slave.read_from(0x00, 1)
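The 7-bit address plus direction bit described above can be sketched in plain Python (no hardware needed; `i2c_frame_address` is a hypothetical helper, not part of pyftdi):

```python
def i2c_frame_address(addr7: int, read: bool) -> int:
    """Return the 8-bit address byte seen on the bus:
    the 7-bit address shifted left by one, with the R/W bit in bit 0."""
    assert 0 <= addr7 <= 0x7F
    return (addr7 << 1) | (1 if read else 0)

# For the slave at 0x21 used above:
print(hex(i2c_frame_address(0x21, read=False)))  # 0x42 (write)
print(hex(i2c_frame_address(0x21, read=True)))   # 0x43 (read)
```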

SPI

  • Below, a UM232H is used to read the manufacturer ID (jedec_id) of a bare SPI NOR flash (W25Qxx). Wiring:

    UM232H           W25Q64FV
    AD0 <-----> CLK pin6
    AD1 <-----> DI pin5
    AD2 <-----> DO pin2
    AD3 <-----> CS pin1
    GND <-----> GND pin4
    VCC <-----> VCC pin8
    VCC <-----> /HOLD pin7
    VCC <-----> /WP pin3
  • A quick test that reads the jedec_id:

import usb
import usb.util
from pyftdi.spi import SpiController
dev = usb.core.find(idVendor=0x0403, idProduct=0x6014)

spi = SpiController()
spi.configure(dev)

# Get a port to a SPI slave w/ /CS on A*BUS3 and SPI mode 0 @ 12MHz
slave = spi.get_port(cs=0,freq=12E6,mode=0)

jedec_id = slave.exchange([0x9f],3)

print(hex(jedec_id[1]<< 8 | jedec_id[2]))
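The three bytes returned by command 0x9F are conventionally manufacturer ID, memory type, and a capacity exponent (capacity = 2^code bytes). A minimal decoding sketch, assuming the commonly documented W25Q64 ID bytes EF 40 17 (`decode_jedec_id` is a hypothetical helper):

```python
def decode_jedec_id(jid):
    """Split a 3-byte JEDEC ID into (manufacturer, memory type, size in bytes)."""
    manufacturer, mem_type, capacity_code = jid
    return manufacturer, mem_type, 1 << capacity_code

mfg, mtype, size = decode_jedec_id([0xEF, 0x40, 0x17])
print(hex(mfg), hex(mtype), size // (1024 * 1024), "MiB")  # 0xef 0x40 8 MiB
```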
pyspiflash/spiflash/tests$ ./serialflash.py
Using FTDI device ftdi://ftdi:232h:1:67/1
Flash device: Winbond W25Q64 8 MiB @ SPI freq 12.0 MHz
.Read 8192 KiB in 7 seconds @ 1152 KiB/s
..Erase 1024 KiB from flash @ 0x700000 (may take a while...)
Erased 1024 KiB in 17 seconds @ 59 KiB/s
Build test sequence
Writing 1024 KiB to flash (may take a while...)
Wrote 1024 KiB in 14 seconds @ 70 KiB/s
Reading 1024 KiB from flash
Read 1024 KiB in 915 ms @ 1118 KiB/s
Verify flash
Reference: 4942ea371ad576065759f232f429a8abf10c755a
Retrieved: 4942ea371ad576065759f232f429a8abf10c755a
...
----------------------------------------------------------------------
Ran 6 tests in 42.775s

OK

Reading flash with flashrom

  • FT2232SPI_Programmer

  • Unpacking the binary firmware /w Binwalk

  • Zyxel firmware extraction and password analysis

  • flashrom

  • Probe the flash chip type:

    ~$ flashrom -L

    ~$ flashrom -p ft2232_spi:type=2232H,port=A
    flashrom v1.2 on Linux 5.16.13-20220310 (x86_64)
    flashrom is free software, get the source code at https://flashrom.org

    Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
    Found Macronix flash chip "MX25L6405" (8192 kB, SPI) on ft2232_spi.
    Found Macronix flash chip "MX25L6405D" (8192 kB, SPI) on ft2232_spi.
    Found Macronix flash chip "MX25L6406E/MX25L6408E" (8192 kB, SPI) on ft2232_spi.
    Found Macronix flash chip "MX25L6436E/MX25L6445E/MX25L6465E/MX25L6473E/MX25L6473F" (8192 kB, SPI) on ft2232_spi.
    Multiple flash chip definitions match the detected chip(s): "MX25L6405", "MX25L6405D", "MX25L6406E/MX25L6408E", "MX25L6436E/MX25L6445E/MX25L6465E/MX25L6473E/MX25L6473F"
    Please specify which chip definition to use with the -c <chipname> option.

  • Read the flash contents:

~$ flashrom -p ft2232_spi:type=2232H,port=A -r test-mx25l6445e.rom -c "MX25L6436E/MX25L6445E/MX25L6465E/MX25L6473E/MX25L6473F"
flashrom v1.2 on Linux 5.16.13-20220310 (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Found Macronix flash chip "MX25L6436E/MX25L6445E/MX25L6465E/MX25L6473E/MX25L6473F" (8192 kB, SPI) on ft2232_spi.
Reading flash... done.

Display drivers

OpenOCD

  • OpenOCD supports the FTDI MPSSE mode: --enable-ftdi Enable building support for the MPSSE mode of FTDI
~$ cd openocd
~$ ./configure --enable-sysfsgpio --enable-buspirate --enable-ftdi
  • Besides enabling this support in openocd, the wiring also matters. Both interface/ftdi/ft232h-module-swd.cfg and interface/ftdi/minimodule-swd.cfg were tried here and both connect successfully. The config files carry wiring notes in their comments; some of the links above suggest a 470 ohm resistor between ADBUS1 and ADBUS2, but none was used here. Just make sure the vid_pid in the config file matches the connected hardware.
  • A J-Link-OB or another board's on-board ST-Link needs only three wires, but here nTRST must also be connected; on the stm32f103zet6 that is PB4.
    # FT232HQ minimodule channel 0 (Channel A)
    # Connector FTDI Target
    # Pin Name
    # --------- ------ ------
    # CN2-10 GND GND
    # CN2-13 ADBUS0 (TCK) SWCLK
    # CN2-14 ADBUS2 (TDI/TDO) SWDIO
    # CN2-15 ADBUS1 (TDO/TDI) SWDIO
    # CN2-17 ADBUS4 (GPIOL0) nTRST

~$ openocd -f interface/ftdi/ft232h-module-swd.cfg -f target/stm32f1x.cfg -c init \
-c "reset halt" -c "flash write_image erase nuttx.bin 0x08000000"

Raspberry Pi

Saleae logic analyzer

Saleae AnalyzerSDK

SDIO protocol plugin

  • SaleaeSDIOAnalyzer
  • The AnalyzerSDK and SaleaeSDIOAnalyzer directories must sit side by side in the same parent directory; then enter SaleaeSDIOAnalyzer and run cmake . && make. If there are no errors, libSDIOAnalyzer.so is produced.
  • Configure Logic to look for the Analyzer Plugin
    Launch Logic manually
    Options -> Preferences
    Under [For Developers], “Search this path for Analyzer Plugins”
    Browse for the ../sdmmc-analyzer/xcode4/build/Debug directory
    Click “Save” and close Logic

SDMMC protocol

  • SD/MMC Analyzer for Logic
  • The upstream source is built with a Python script; here a CMake script placed in its directory handles the build instead. On success you will see libSDMMCAnalyzer.so.
sdmmc-analyzer$ cat CMakeLists.txt
project("Saleae SDMMC Analyzer")
cmake_minimum_required(VERSION 3.0)

message(WARNING "CMake support is still experimental!")

# Find Analyzer include dir
find_path(
ANALYZER_SDK_INCLUDE_DIR
NAMES
Analyzer.h
AnalyzerChannelData.h
AnalyzerHelpers.h
AnalyzerResults.h
AnalyzerSettings.h
AnalyzerTypes.h
SimulationChannelDescriptor.h
PATHS
../include/
../AnalyzerSDK/include
DOC
"Include directory of the analyzer SDK."
)

if(NOT ANALYZER_SDK_INCLUDE_DIR)
message(SEND_ERROR "Analyzer SDK include directory not found")
else()
message(STATUS
"Analyzer SDK include directory found at ${ANALYZER_SDK_INCLUDE_DIR}")
endif()

# needed to differ between 32 and 64 bit library
set(ANALYZER_BITNESS)
set(ANALYZER_LIB_NAME "")

if(CMAKE_SIZEOF_VOID_P EQUAL 4)
message(STATUS "32 Bit detected")
set(ANALYZER_BITNESS 32)
set(ANALYZER_LIB_NAME "Analyzer")
elseif(CMAKE_SIZEOF_VOID_P EQUAL 8)
message(STATUS "64 Bit detected")
set(ANALYZER_BITNESS 64)
set(ANALYZER_LIB_NAME "Analyzer64")
else()
message(FATAL_ERROR "Environment not supported")
endif()

if(NOT (WIN32 OR UNIX))
# I have no idea what to do under MacOS
message(WARNING "Environment may not be supported")
endif()

# find library
find_library(
ANALYZER_SDK_LIBRARY
NAMES
${ANALYZER_LIB_NAME}
PATHS
../lib/
../AnalyzerSDK/lib/
DOC
"Analyzer SDK library. \
If you set it yourself, choose the correct architecture"
)

if(NOT ANALYZER_SDK_LIBRARY)
message(SEND_ERROR "Analyzer SDK library not found")
else()
message(STATUS "Analyzer SDK library found at ${ANALYZER_SDK_LIBRARY}")
endif()


add_library(SDMMCAnalyzer SHARED
src/SDMMCAnalyzer.cpp
src/SDMMCAnalyzer.h
src/SDMMCAnalyzerResults.cpp
src/SDMMCAnalyzerResults.h
src/SDMMCAnalyzerSettings.cpp
src/SDMMCAnalyzerSettings.h
src/SDMMCHelpers.cpp
src/SDMMCHelpers.h
src/SDMMCSimulationDataGenerator.cpp
src/SDMMCSimulationDataGenerator.h
)

target_include_directories(SDMMCAnalyzer PUBLIC
source
${ANALYZER_SDK_INCLUDE_DIR}
)

target_link_libraries(SDMMCAnalyzer
PUBLIC
${ANALYZER_SDK_LIBRARY}
)

target_compile_features(SDMMCAnalyzer
PRIVATE
cxx_nullptr
)

QSPI protocol plugin

STM8S103

Building examples with SDCC

Arduino support

Programming tools

  • stm8flash
  • stm8flash is an open-source tool that programs via ST-Link hardware; here I still use the openocd + ft2232 approach to flash the chip.

Other resources


Building NuttX

Atmel SAM4S Xplained Pro

Introduction

  • Core

    • ARM Cortex-M4 with 2 Kbytes of cache running at up to 120 MHz
    • Memory Protection Unit (MPU)
    • DSP Instruction Set
    • Thumb ® -2 instruction set
  • Memories

    • Up to 2048 Kbytes embedded Flash with optional dual-bank and cache memory, ECC, Security Bit and Lock Bits
    • Up to 160 Kbytes embedded SRAM
    • 16 Kbytes ROM with embedded boot loader routines (UART, USB) and IAP routines
    • 8-bit Static Memory Controller (SMC): SRAM, PSRAM, NOR and NAND Flash support
  • Clone the nuttx and nuttx-apps sources into two sibling directories:

    ~$ git clone https://github.com/apache/incubator-nuttx nuttx
    ~$ git clone https://github.com/apache/incubator-nuttx-apps.git apps

    ~$ cd nuttx
    # List all supported boards.
    ~$ tools/configure.sh -L | grep "sam"
    ~$ tools/configure.sh -l sam4s-xplained-pro:nsh
    ~$ cp boards/arm/sam34/sam4s-xplained-pro/configs/nsh/defconfig .config
    ~$ cp boards/arm/sam34/sam4s-xplained-pro/scripts/Make.defs .

    # A third-party toolchain (gcc-arm-none-eabi-6-2017-q2-update, prefix arm-none-eabi-) also builds this successfully; select CONFIG_ARMV7M_TOOLCHAIN_GNU_EABIL=y


    ~$ export PATH=/fullpath/gcc-arm-none-eabi-6-2017-q2-update/bin:$PATH
    ~$ make CROSSDEV=arm-none-eabi-

    # List the CPU variants supported by the cross toolchain.
    ~$ arm-none-eabi-g++ -print-multi-lib
    .;
    thumb;@mthumb
    fpu;@mfloat-abi=hard
    armv6-m;@mthumb@march=armv6s-m
    armv7-m;@mthumb@march=armv7-m
    armv7e-m;@mthumb@march=armv7e-m
    armv7-ar/thumb;@mthumb@march=armv7
    cortex-m7;@mthumb@mcpu=cortex-m7
    armv7e-m/softfp;@mthumb@march=armv7e-m@mfloat-abi=softfp@mfpu=fpv4-sp-d16
    armv7e-m/fpu;@mthumb@march=armv7e-m@mfloat-abi=hard@mfpu=fpv4-sp-d16
    armv7-ar/thumb/softfp;@mthumb@march=armv7@mfloat-abi=softfp@mfpu=vfpv3-d16
    armv7-ar/thumb/fpu;@mthumb@march=armv7@mfloat-abi=hard@mfpu=vfpv3-d16
    cortex-m7/softfp/fpv5-sp-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=softfp@mfpu=fpv5-sp-d16
    cortex-m7/softfp/fpv5-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=softfp@mfpu=fpv5-d16
    cortex-m7/fpu/fpv5-sp-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=hard@mfpu=fpv5-sp-d16
    cortex-m7/fpu/fpv5-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=hard@mfpu=fpv5-d16

  • For the following error, open the indicated source location and comment the offending lines out:

arch/arm/src/imxrt/Kconfig:1114: syntax error
arch/arm/src/imxrt/Kconfig:1113: invalid option
make: *** [tools/Makefile.unix:471: olddefconfig] Error 1
ERROR: failed to refresh

Building a cross toolchain with Buildroot

  • buildroot nuttx
  • Here the NuttX-maintained fork of Buildroot is used: Buildroot builds a special-purpose cross toolchain, which is then used to build the final NuttX image. The buildroot source directory sits at the same level as nuttx and nuttx-apps. In practice, enabling BR2_GCC_CORTEX_M4F_SP caused the error shown in the debugging section below: the Atmel SAM4S Xplained Pro has a Cortex-M4 core but no FPU, so forcing an FPU toolchain is bound to fail.
~$ git clone https://bitbucket.org/nuttx/buildroot.git buildroot
~$ cp configs/cortexm4f-eabi-defconfig-4.7.4 .config
# Pick suitable versions; here: binutils-2.26.1, gcc-4.7.4, gdb-8.0.1. Since the config is cortexm4f-eabi-defconfig-4.7.4, gcc-4.7.4 is chosen.
# When configuring nuttx, select CONFIG_ARMV7M_TOOLCHAIN_BUILDROOT=y (for a third-party toolchain, as above, select CONFIG_ARMV7M_TOOLCHAIN_GNU_EABIL=y).
~$ make menuconfig
~$ make
# The final configuration is as follows:
~$ grep -v '^$\|^#' .config
BR2_HAVE_DOT_CONFIG=y
BR2_arm=y
BR2_cortex_m3=y
BR2_GCC_CORTEX=y
BR2_ARM_EABI=y
BR2_ARCH="arm"
BR2_GCC_TARGET_TUNE="cortex-m3"
BR2_GCC_TARGET_ARCH="armv7-m"
BR2_GCC_TARGET_ABI="aapcs-linux"
BR2_WGET="wget --passive-ftp"
BR2_SVN="svn co"
BR2_ZCAT="zcat"
BR2_BZCAT="bzcat"
BR2_TAR_OPTIONS=""
BR2_DL_DIR="$(BASE_DIR)/dl"
BR2_STAGING_DIR="$(BUILD_DIR)/staging_dir"
BR2_NUTTX_DIR="$(TOPDIR)/../nuttx"
BR2_TOPDIR_PREFIX=""
BR2_TOPDIR_SUFFIX=""
BR2_GNU_BUILD_SUFFIX="pc-elf"
BR2_GNU_TARGET_SUFFIX="nuttx-eabi"
BR2_PREFER_IMA=y
BR2_PACKAGE_BINUTILS=y
BR2_BINUTILS_VERSION_2_26_1=y
BR2_BINUTILS_VERSION="2.26.1"
BR2_EXTRA_BINUTILS_CONFIG_OPTIONS=""
BR2_PACKAGE_GCC=y
BR2_GCC_VERSION_4_7_4=y
BR2_GCC_SUPPORTS_SYSROOT=y
BR2_GCC_SUPPORTS_DOWN_PREREQ=y
BR2_GCC_DOWNLOAD_PREREQUISITES=y
BR2_GCC_VERSION="4.7.4"
BR2_EXTRA_GCC_CONFIG_OPTIONS=""
BR2_INSTALL_LIBSTDCPP=y
BR2_PACKAGE_GDB_HOST=y
BR2_GDB_VERSION_8_0_1=y
BR2_PACKAGE_GDB_TUI=y
BR2_GDB_VERSION="8.0.1"
BR2_PACKAGE_GENROMFS=y
BR2_PACKAGE_KCONFIG_FRONTENDS=y
BR2_KCONFIG_VERSION_4_11_0_1=y
BR2_KCONFIG_FRONTENDS_VERSION="4.11.0.1"
BR2_LARGEFILE=y
BR2_TARGET_OPTIMIZATION="-Os -pipe"

# After a successful build you can run: arm-nuttx-eabi-g++ -print-multi-lib
BR2_ENABLE_MULTILIB=y
BR2_LARGEFILE=y
BR2_SOFT_FLOAT=y
BR2_TARGET_OPTIMIZATION="-Os -pipe"
  • On success the following directory tree is generated. Note: with BR2_ENABLE_MULTILIB=y and BR2_SOFT_FLOAT=y enabled the output directory is build_arm_nofpu; otherwise it is build_arm.
~$ tree -L 2 build_arm_hf/
build_arm_hf/
├── root
└── staging_dir
├── arm-elf -> arm-nuttx-eabi
├── arm-nuttx-eabi
├── bin
├── include
├── lib
├── libexec
├── share
└── usr

10 directories, 0 files
  • Test the toolchain:
~$ export PATH=`pwd`/build_arm_hf/staging_dir/bin:$PATH
~$ arm-nuttx-eabi-g++ -print-multi-lib
.;
thumb;@mthumb
fpu;@mfloat-abi=hard
  • If the following error appears, try selecting a newer gdb version in make menuconfig:
    /fullpath/buildroot/toolchain_build_arm_hf/gdb-7.9.1/gdb/python/python.c: In function ‘_initialize_python’:
    /fullpath/buildroot/toolchain_build_arm_hf/gdb-7.9.1/gdb/python/python.c:1690:3: error: too few arguments to function ‘_PyImport_FixupBuiltin’
    _PyImport_FixupBuiltin (gdb_module, "_gdb");

  • If gcc fails to download its mpc, mpfr and gmp prerequisites during the build, check toolchain_build_arm_hf/gcc-4.9.4/contrib/download_prerequisites and adjust the version numbers or the download URLs.

Patching gcc-4.7.4

  • If the following error appears while building the cross toolchain:
In file included from .../gcc-4.7.4/gcc/cp/except.c:990:0:
cfns.gperf: At top level:
cfns.gperf:101:1: error: 'gnu_inline' attribute present on 'libc_name_p'
cfns.gperf:26:14: error: but not here
~$ cat > toolchain/gcc/4.7.4/gnu_inline.patch <<EOF
diff --git a/gcc/cp/cfns.gperf b/gcc/cp/cfns.gperf
index 68acd3d..953262f 100644
--- a/gcc/cp/cfns.gperf
+++ b/gcc/cp/cfns.gperf
@@ -22,6 +22,9 @@ __inline
static unsigned int hash (const char *, unsigned int);
#ifdef __GNUC__
__inline
+#ifdef __GNUC_STDC_INLINE__
+__attribute__ ((__gnu_inline__))
+#endif
#endif
const char * libc_name_p (const char *, unsigned int);
%}
diff --git a/gcc/cp/cfns.h b/gcc/cp/cfns.h
index 1c6665d..6d00c0e 100644
--- a/gcc/cp/cfns.h
+++ b/gcc/cp/cfns.h
@@ -53,6 +53,9 @@ __inline
static unsigned int hash (const char *, unsigned int);
#ifdef __GNUC__
__inline
+#ifdef __GNUC_STDC_INLINE__
+__attribute__ ((__gnu_inline__))
+#endif
#endif
const char * libc_name_p (const char *, unsigned int);
/* maximum key range = 391, duplicates = 0 */
EOF

GCC-4.7.4 build errors

make[4]: Entering directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build/libiberty/testsuite'
make[4]: Nothing to be done for 'install'.
make[4]: Leaving directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build/libiberty/testsuite'
make[3]: Leaving directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build/libiberty'
/bin/bash: line 3: cd: arm-nuttx-eabi/libgcc: No such file or directory
make[2]: *** [Makefile:10334: install-target-libgcc] Error 1
make[2]: Leaving directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build'
make[1]: *** [Makefile:2115: install] Error 2
make[1]: Leaving directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build'
make: *** [toolchain/gcc/gcc-nuttx-4.x.mk:159: /fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build/.installed] Error 2
  • The error above appears to be specific to this GCC version; 4.9.x does not seem to hit it. Checking the various config.log files and Makefiles turned up nothing, so in the end I entered the build directory toolchain_build_arm/gcc-4.7.4-build and ran make there directly; once that finished cleanly, running make again from the buildroot top level completed normally.

Configuring the NuttX system

~$ tools/configure.sh -l  sam4s-xplained-pro:nsh
~$ cp boards/arm/sam34/sam4s-xplained-pro/configs/nsh/defconfig .config
~$ cp boards/arm/sam34/sam4s-xplained-pro/scripts/Make.defs .
~$ make menuconfig
~$ make
[...]
LD: nuttx
make[1]: Leaving directory '/fullpath/nuttx/arch/arm/src'
CP: nuttx.hex
CP: nuttx.bin
  • The final NuttX configuration:
~$ grep -v '^$\|^#' .config
  • The build error below is likely the result of running make without CONFIG_ARCH_IRQPRIO=y selected. Searching the nuttx sources shows that getcontrol(void) is an inline function defined in arch/arm/include/armv7-m/irq.h; adding #include <nuttx/irq.h> near the top of arch/arm/src/chip/sam_start.c fixes it.
/fullpath/nuttx/staging/libarch.a(sam_start.o): In function `sam_fpuconfig':
/fullpath/nuttx/arch/arm/src/chip/sam_start.c:159: undefined reference to `getcontrol`
/fullpath/nuttx/arch/arm/src/chip/sam_start.c:161: undefined reference to `setcontrol`

Debugging with OpenOCD

Building OpenOCD

~$ git clone http://repo.or.cz/r/openocd.git
~$ cd openocd && mkdir build-linux && cd build-linux
~$ ../configure --enable-ftdi --enable-stlink --enable-ti-icdi --enable-ulink --enable-usb-blaster-2 --enable-ft232r --enable-xds110 --enable-usbprog --enable-armjtagew --enable-cmsis-dap --enable-usb-blaster --enable-openjtag --enable-jlink --enable-bcm2835gpio --enable-imx_gpio --enable-oocd_trace --enable-buspirate --enable-sysfsgpio
~$ make && make install
  • According to the official Debugging Nuttx notes, debugging certain features may require adjusting the macros in openocd/src/rtos/nuttx_header.h to match your configuration, e.g. CONFIG_DISABLE_MQUEUE=y.

  • Start the OpenOCD server. Connect the PC via USB to the connector labeled DEBUG USB on the Atmel SAM4S Xplained Pro. Note that this USB connector is the composite port of the on-board debugger (EDBG) and carries three functions: DEBUG, Virtual COM Port, and Data Gateway Interface (DGI).

# With a board-level config file this can also be run directly as: openocd -f board/atmel_sam4s_xplained_pro.cfg -c init -c "reset halt"
~$ openocd -f interface/cmsis-dap.cfg -f target/at91sam4sd32x.cfg -c init -c "reset halt"
Open On-Chip Debugger 0.10.0+dev-01408-g762ddcb74-dirty (2020-09-25-00:32)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "swd". To override use 'transport select <transport>'.
Info : CMSIS-DAP: SWD Supported
Info : CMSIS-DAP: JTAG Supported
Info : CMSIS-DAP: FW Version = 1.0
Info : CMSIS-DAP: Serial# = ATML1803040200001055
Info : CMSIS-DAP: Interface Initialised (SWD)
Info : SWCLK/TCK = 1 SWDIO/TMS = 1 TDI = 1 TDO = 1 nTRST = 0 nRESET = 1
Info : CMSIS-DAP: Interface ready
Info : clock speed 500 kHz
Info : SWD DPIDR 0x2ba01477
Info : sam4.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : starting gdb server for sam4.cpu on 3333
Info : Listening on port 3333 for gdb connections
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x00400554 msp: 0x200034d0
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : accepting 'telnet' connection on tcp/4444
Info : dropped 'telnet' connection
Info : accepting 'telnet' connection on tcp/4444
Info : dropped 'telnet' connection
Info : accepting 'gdb' connection on tcp/3333
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x00400554 msp: 0x200034d0

  • Attach GDB to the OpenOCD server for debugging:
~$ arm-nuttx-eabi-gdb nuttx
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-elf --target=arm-nuttx-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from nuttx...done.
(gdb) target extended-remote :3333
0x00400554 in arm_earlyserialinit () at chip/sam_serial.c:1345
1345 up_disableallints(TTYS1_DEV.priv, NULL);
(gdb) load # Load (i.e. flash) the image onto the board.
# The OpenOCD terminal prints corresponding progress messages.
Loading section .text, size 0x19003 lma 0x400000
Loading section .ARM.extab, size 0x30 lma 0x419004
Loading section .ARM.exidx, size 0xd0 lma 0x419034
Loading section .data, size 0x220 lma 0x419104
Start address 0x4000cc, load size 103203
Transfer rate: 18 KB/sec, 10320 bytes/write.
(gdb) cont # Continue. An exception occurs here; this faulty run is caused by building the toolchain with `BR2_GCC_CORTEX_M4F_SP=y`, which does not match the target: the SAM4S-XPRO is a Cortex-M4 but has no FPU.
Continuing.
sam4.cpu -- clearing lockup after double fault

Program received signal SIGINT, Interrupt.
exception_common () at armv7-m/gnu/arm_exception.S:176
176 vstmdb sp!, {s16-s31} /* Save the non-volatile FP context */
(gdb) i r # For the full set of GDB commands, the documentation for your GDB version is the most authoritative and detailed reference.
r0 0x3 3
r1 0x1 1
r2 0x20001e78 536878712
[....]

  • Several options can also be combined on one command line, e.g. arm-nuttx-eabi-gdb -ex "target remote :3333" -ex "mon reset halt" nuttx

  • For messages like ATSAM4SD32C.cpu -- clearing lockup after double fault, the handling followed a stackexchange discussion.

  • Inspect the produced artifacts:

~$ file nuttx
nuttx: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped
~$ file nuttx.bin
nuttx.bin: data
~$ file nuttx.hex
nuttx.hex: ASCII text, with CRLF line terminators

Debugging with GDB

The OpenOCD server

# ~$ openocd -f board/atmel_sam4s_xplained_pro.cfg -c init -c 'reset halt' -c '$_TARGETNAME configure -rtos nuttx'
~$ openocd -f board/atmel_sam4s_xplained_pro.cfg -c '$_TARGETNAME configure -rtos nuttx'

Connecting GDB

  • nuttx的目录下运行,为了加载nuttx文件.
~ nuttx$ arm-none-eabi-gdb -ex "target remote :3333" -ex "mon reset halt"  nuttx
  • Define some gdb hook functions to make debugging a bit easier. The best place for these defines is ~/.gdbinit, but note: if you later use the system gdb to debug non-NuttX programs, remember to remove ~/.gdbinit.
(gdb) define hookpost-file
Type commands for definition of "hookpost-file".
End with a line saying just "end".
> eval "monitor nuttx.pid_offset %d", &((struct tcb_s *)(0))->pid
> eval "monitor nuttx.xcpreg_offset %d", &((struct tcb_s *)(0))->xcp.regs
> eval "monitor nuttx.state_offset %d", &((struct tcb_s *)(0))->task_state
> eval "monitor nuttx.name_offset %d", &((struct tcb_s *)(0))->name
> eval "monitor nuttx.name_size %d", sizeof(((struct tcb_s *)(0))->name)
>end
  • Connect to the remote openocd port:
(gdb) target extended-remote :3333
Remote debugging using :3333
__start () at chip/sam_start.c:269
269 {
(gdb) file nuttx

  • The file nuttx command above loads the file's symbols; because the hook on file is defined, openocd prints information such as:
Error: No symbols for NuttX  # May be shown; if it appears every time, `CONFIG_DEBUG_SYMBOLS` is not enabled
Info : pid_offset: 12
Info : xcpreg_offset: 132
Info : state_offset: 26
Info : name_offset: 208
Info : name_size: 16

  • Show thread information:
(gdb) info threads
warning: while parsing threads: not well-formed (invalid token)
Id Target Id Frame
* 1 Remote target __start () at chip/sam_start.c:269
  • Show the registers:
(gdb) info registers
r0 0x0 0
r1 0x20004360 536888160
r2 0x20004360 536888160
r3 0x1 1
r4 0xc 12
r5 0x200010cc 536875212
r6 0x200010cc 536875212
r7 0x3 3
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x20004290 536887952
sp 0x20004340 0x20004340
lr 0x8012621 134293025
pc 0x8003b4c 0x8003b4c <memcpy+20>
xPSR 0x61000000 1627389952

Setting breakpoints

  • Breakpoints

  • The breakpoints below are set on specific source lines. To avoid having to re-enter them when the program runs away or the hardware resets, save them to bp.txt; next time they can be reloaded with source bp.txt.

(gdb) b chip/sam_hsmci.c:757
Breakpoint 1 at 0x417cf8: file chip/sam_hsmci.c, line 757.
(gdb) b mmcsd/mmcsd_sdio.c:2780
Breakpoint 2 at 0x415fc4: file mmcsd/mmcsd_sdio.c, line 2780.
(gdb) save breakpoints bp.txt
Saved to file 'bp.txt'.
(gdb) rbreak bp.txt
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x00417cf8 in sam_clock at chip/sam_hsmci.c:757
2 breakpoint keep y 0x00415fc4 in mmcsd_probe at mmcsd/mmcsd_sdio.c:2780

  • Inspect the call stack. When bringing up a board, before the serial console is ready (for example when the crystal fails to start), backtrace is very useful:
(gdb) backtrace
#0 sam_clock (dev=0x2000016c <g_sdiodev>, rate=CLOCK_SD_TRANSFER_1BIT)
at chip/sam_hsmci.c:1580
#1 0x00415e56 in mmcsd_sdinitialize (priv=0x200043b0) at mmcsd/mmcsd_sdio.c:3023
#2 mmcsd_probe (priv=priv@entry=0x200043b0) at mmcsd/mmcsd_sdio.c:3465
#3 0x004162d2 in mmcsd_mediachange (arg=0x200043b0) at mmcsd/mmcsd_sdio.c:2545
#4 0x004185f2 in sdio_mediachange (dev=0x2000016c <g_sdiodev>, cardinslot=<optimized out>)
at chip/sam_hsmci.c:2799
#5 0x004148e4 in sam_hsmci_initialize () at sam_hsmci.c:180
#6 0x0041478a in board_app_initialize (arg=arg@entry=0) at sam_appinit.c:129
#7 0x004120fa in boardctl (cmd=cmd@entry=65281, arg=arg@entry=0) at boardctl.c:326
#8 0x00405cfa in nsh_initialize () at nsh_init.c:103
#9 0x00405ccc in nsh_main (argc=1, argv=0x200059c8) at nsh_main.c:143
#10 0x00403858 in nxtask_startup (entrypt=0x405ca9 <nsh_main>, argc=1, argv=0x200059c8)
at sched/task_startup.c:165
#11 0x00401320 in nxtask_start () at task/task_start.c:144
#12 0x00000000 in ?? ()

  • Single stepping (s = step into, n = step over). Stepping into can descend deep into the call stack; use the finish command to return to the caller.
(gdb) step
100 ret = nxsig_nanosleep(&rqtp, &rmtp);
(gdb) stepi
nxsig_nanosleep (rqtp=rqtp@entry=0x20004330, rmtp=rmtp@entry=0x20004338) at signal/sig_nanosleep.c:108
108 {
(gdb) n
116 if (rqtp == NULL || rqtp->tv_nsec < 0 || rqtp->tv_nsec >= 1000000000)
  • Inspect variables:
btle_main (argc=<error reading variable: value has been optimized out>, argv=<error reading variable: value has been optimized out>) at nrf24l01_btle.c:372
372 memcpy(chunk(buffer,pls)->data,&hum,2);
(gdb) p hum
$5 = 1844
(gdb) p temp
$6 = 3139
(gdb) p buffer.playload
There is no member named playload.
(gdb) x buffer.payload
0x200010d4 <buffer+8>: 0x06000102
(gdb) x/32b buffer.payload
0x200010d4 <buffer+8>: 2 1 0 6 9 0 0 0
0x200010dc <buffer+16>: 0 0 7 22 0 0 0 0
0x200010e4 <buffer+24>: 0 0 40 7 -56 0 0 0
0x200010ec <current>: 0 0 0 0 0 0 0 0
(gdb) x/32x buffer.payload # The array in hex format.
0x200010d4 <buffer+8>: 0x02 0x01 0x00 0x06 0x09 0x00 0x00 0x00
0x200010dc <buffer+16>: 0x00 0x00 0x07 0x16 0x00 0x00 0x00 0x00
0x200010e4 <buffer+24>: 0x00 0x00 0x28 0x07 0xc8 0x00 0x00 0x00
0x200010ec <current>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) print /x buffer.payload # Print the array.
$2 = {0x2, 0x1, 0x0, 0x6, 0x9, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7, 0x16, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x28, 0x7, 0xc8, 0x0, 0x0, 0x0}
  • Hexdump an address or array:
(gdb) x /32bx data
0x20004374: 0x44 0x12 0x00 0x20 0x20 0x00 0x00 0x00
0x2000437c: 0x95 0x3a 0x01 0x08 0x44 0x12 0x00 0x20
0x20004384: 0x03 0x00 0x00 0x00 0x04 0x00 0x00 0x00
0x2000438c: 0x07 0x5d 0x01 0x08 0x10 0x00 0x00 0x00

  • Inspect a struct:
(gdb) set print pretty on
(gdb) p dev
$4 = (struct nrf24l01_dev_s *) 0x20003370
(gdb) p *dev
$5 = {
spi = 0x20000174 <g_spi2dev>,
config = 0x20000140 <nrf_cfg>,
state = ST_STANDBY,
tx_payload_noack = 0 '\000',
en_aa = 63 '?',
en_pipes = 1 '\001',
ce_enabled = 0 '\000',
lastxmitcount = 0 '\000',
addrlen = 5 '\005',
pipedatalen = "!\000\000\000\000",
pipe0addr = "\001\312\376\022\064",
last_recvpipeno = 0 '\000',
sem_tx = {
semcount = 0
},
tx_pending = 1 '\001',
rx_fifo = 0x200033d0 "h\020",
fifo_len = 0,
nxt_read = 0,
nxt_write = 0,
[.....]

Flashing the firmware

  • Flash to flash0; the address 0x00400000 is defined in the linker script boards/arm/sam34/sam4s-xplained-pro/scripts/sam4s-xplained-pro.ld.
~$ openocd -f board/atmel_sam4s_xplained_pro.cfg -c init -c "reset halt" -c "flash write_image erase nuttx.bin 0x00400000" -c "reset"
Open On-Chip Debugger 0.10.0+dev-01408-g762ddcb74-dirty (2020-09-25-00:32)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "swd". To override use 'transport select <transport>'.
Info : CMSIS-DAP: SWD Supported
Info : CMSIS-DAP: JTAG Supported
Info : CMSIS-DAP: FW Version = 1.0
Info : CMSIS-DAP: Serial# = ATML1803040200001055
Info : CMSIS-DAP: Interface Initialised (SWD)
Info : SWCLK/TCK = 1 SWDIO/TMS = 1 TDI = 1 TDO = 1 nTRST = 0 nRESET = 1
Info : CMSIS-DAP: Interface ready
Info : clock speed 500 kHz
Info : SWD DPIDR 0x2ba01477
Info : ATSAM4SD32C.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : starting gdb server for ATSAM4SD32C.cpu on 3333
Info : Listening on port 3333 for gdb connections
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x004000cc msp: 0x20001ee4
Info : sam4 does not auto-erase while programming (Erasing relevant sectors)
Info : sam4 First: 0x00000000 Last: 0x0000000c
Info : Erasing sector: 0x00000000
Info : Erasing sector: 0x00000001
Info : Erasing sector: 0x00000002
Info : Erasing sector: 0x00000003
Info : Erasing sector: 0x00000004
Info : Erasing sector: 0x00000005
Info : Erasing sector: 0x00000006
Info : Erasing sector: 0x00000007
Info : Erasing sector: 0x00000008
Info : Erasing sector: 0x00000009
Info : Erasing sector: 0x0000000a
Info : Erasing sector: 0x0000000b
Info : Erasing sector: 0x0000000c
auto erase enabled
wrote 106496 bytes from file nuttx.bin in 5.418800s (19.192 KiB/s)

  • Connect to its UART interface and log into the system.
~$ sudo minicom -o -b 115200 -D /dev/ttyACM0
NuttShell (NSH) NuttX-9.1.0
nsh> mm # test memory

NAND porting

sam_nand_initialize: CS0
nand_initialize: cmdaddr=0x60400000 addraddr=0x60200000 dataaddr=0x60000000
onfi_ebidetect: cmdaddr=60400000 addraddr=60200000 dataaddr=60000000
onfi_read: cmdaddr=60400000 addraddr=60200000 dataaddr=60000000
onfi_read: Returning:
onfi_read: manufacturer: 0x2c
onfi_read: buswidth: 0
onfi_read: luns: 1
onfi_read: eccsize: 4
onfi_read: model: 0x @
onfi_read: sparesize: 64
onfi_read: pagesperblock: 64
onfi_read: blocksperlun: 2048
onfi_read: pagesize: 2048
nand_initialize: Found ONFI compliant NAND FLASH
nand_devscan: Retrieving bad block information. nblocks=2048
  • Reads and writes operate on pages; erases operate on blocks. Conceptually, from largest to smallest:
Nand Flash ⇒ Chip ⇒ Plane ⇒ Block ⇒ Page ⇒ oob
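From the onfi_read log above (pagesize 2048, pagesperblock 64, blocksperlun 2048), the byte offset of any page in the data area follows directly. A minimal sketch — the constants and helper name here are mine for illustration, not a NuttX API:

```python
# Geometry reported by onfi_read above (illustrative constants, not a NuttX API).
PAGE_SIZE = 2048          # bytes per page (data area, OOB bytes not counted)
PAGES_PER_BLOCK = 64
BLOCKS_PER_LUN = 2048

def page_offset(block, page):
    """Byte offset of (block, page) inside the data area."""
    assert 0 <= block < BLOCKS_PER_LUN
    assert 0 <= page < PAGES_PER_BLOCK
    return (block * PAGES_PER_BLOCK + page) * PAGE_SIZE

# Total data capacity: 2048 blocks * 64 pages * 2048 bytes = 256 MiB.
total_bytes = BLOCKS_PER_LUN * PAGES_PER_BLOCK * PAGE_SIZE
```

So block 1, page 0 begins at 0x20000 (128 KiB), matching a 64-page block of 2 KiB pages.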
 hexdump /dev/mtdblock0 count=128
/dev/mtdblock0 at 00000000:
0000: eb 3c 90 4e 55 54 54 58 20 20 20 00 08 20 01 00 .<.NUTTX .. ..
0010: 02 00 02 00 00 f8 05 00 3f 00 ff 00 00 00 00 00 ........?.......
0020: 00 00 02 00 00 00 29 00 00 00 00 20 20 20 20 20 ......)....
0030: 20 20 20 20 20 20 46 41 54 31 36 20 20 20 0e 1f FAT16 ..
0040: be 5b 7c ac 22 c0 74 0b 56 b4 0e bb 07 00 cd 10 .[|.\".t.V.......
0050: 5e eb f0 32 e4 cd 16 cd 19 eb fe 54 68 69 73 20 ^..2.......This
0060: 69 73 20 6e 6f 74 20 61 20 62 6f 6f 74 61 62 6c is not a bootabl
0070: 65 20 64 69 73 6b 2e 20 20 50 6c 65 61 73 65 20 e disk. Please
  • Check the coding style of modified nuttx files:
    git status | grep "modified:" | awk '{print $2}' | xargs tools/checkpatch.sh -f
  • Memory map: for the SAMA5D3 Series see Chapter 5 "Memories" of its datasheet; for the SAM4S see the SAM4S Datasheet Chapter 6 "Product Mapping".

Arduino Due

  • Arduino Due
  • Hacking with the Arduino Due
  • SAM3X-Arduino Pin Mapping
  • The Arduino Due is a microcontroller board based on the Atmel SAM3X8E ARM Cortex-M3 CPU. It is the first Arduino board based on a 32-bit ARM core microcontroller. It has 54 digital input/output pins (of which 12 can be used as PWM outputs), 12 analog inputs, 4 UARTs (hardware serial ports), a 84 MHz clock, an USB OTG capable connection, 2 DAC (digital to analog), 2 TWI, a power jack, an SPI header, a JTAG header, a reset button and an erase button.
  • Per the official warning: the Arduino Due's pins tolerate only 3.3 V; anything above 3.3 V will damage the board.

Flashing with BOSSAC

  • Here we use the official flashing tool BOSSAC, which normally lives under the user's home directory, e.g. ~/.arduino15/packages/arduino/tools/bossac/1.7.0-arduino3/bossac.

Detecting the target board

~$  bossac -p ttyACM0 -U false -i
No device found on ttyACM0
  • Set the correct serial port parameters; if that still fails, press reset and retry, as follows:
~$ stty -F /dev/ttyACM0 speed 1200 cs8 -cstopb -parenb
115200
~ $ bossac -p ttyACM0 -U false -i
Atmel SMART device 0x285e0a60 found
Device : ATSAM3X8
Chip ID : 285e0a60
Version : v1.1 Dec 15 2010 19:25:04
Address : 524288
Pages : 2048
Page Size : 256 bytes
Total Size : 512KB
Planes : 2
Lock Regions : 32
Locked : none
Security : false
Boot Flash : false

Flashing nuttx.bin

$ bossac -p ttyACM0 -U false -e -w -v -b nuttx.bin -R
Atmel SMART device 0x285e0a60 found
Erase flash
done in 0.041 seconds

Write 62368 bytes to flash (244 pages)
[==============================] 100% (244/244 pages)
done in 12.866 seconds

Verify 62368 bytes of flash
[==============================] 100% (244/244 pages)
Verify successful
done in 12.240 seconds
Set boot flash true
CPU reset.

Flashing over SWD

  • Besides BOSSAC, I originally wanted to flash and debug via the SWD interface with OpenOCD, because the Due board has a 4-pin DEBUG header with pinout: 1:RESET, 2:SWDIO, 3:SWCLK, 4:GND.
    ~$ cat > ~/sam3x8e.cfg<<EOF
    source [find interface/stlink.cfg]

    set CPUTAPID 0x2ba01477

    source [find board/atmel_sam3x_ek.cfg]
    EOF

    ~$ openocd -f ~/sam3x8e.cfg -c init -c halt -c "flash write_image erase nuttx.bin 0x80000" -c "at91sam3 gpnvm set 1" -c "exit"

Flashing with the on-board AT16u2

  • at16u2_cmsis_dap

  • I found online that the AT16u2's firmware can be modified to turn it into a CMSIS_DAP adapter, which then supports flashing and debugging with openocd.

Updating the AT16u2 firmware

  • Upgrading16U2Due

  • Flashing an AVR chip requires a programmer such as an AVR JTAG-ICE, AVR-ISP, Atmel-ICE, or USBasp. An arduino board (e.g. an Arduino Uno) can also be turned into an AVR-ISP. Here a NUCLEO-L152RE with arduino-compatible headers is available; using stm32duino it can become an AVR-ISP programmer by flashing the official ArduinoISP sketch onto it.

  • Open Arduino IDE --> File --> Examples (Built-in examples) --> 11.ArduinoISP --> ArduinoISP, and upload it to the NUCLEO-L152RE.

NUCLEO-L152RE        Arduino Due (ICSP)
D10 CS <------> Reset
D11 MOSI <------> MOSI
D12 MISO <------> MISO
D13 SCK <------> SCK
GND <------> GND
+5V <------> +5V

avrdude

  • Download the avrdude source code. avrdude is an open-source AVR flashing tool; its latest release is avrdude-6.3, and the source shows details of the hardware it supports.

  • Below, the NUCLEO-L152RE acts as the programmer, wired as above, to update the firmware of the at16u2 on the Arduino Due board.

    ~$ cd   ~/.arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/
    ~$ tree
    .
    ├── bin
    │   └── avrdude
    └── etc
    └── avrdude.conf

    ~$ avrdude -c arduino -P /dev/ttyACM0 -b 19200 -p atmega16u2 -vvv -U flash:w:at16u2_cmsis_dap/at16u2_cmsis_dap.elf.hex:i

    avrdude: Version 6.3-20171130
    Copyright (c) 2000-2005 Brian Dean, http://www.bdmicro.com/
    Copyright (c) 2007-2014 Joerg Wunsch

    System wide configuration file is "/etc/avrdude.conf"
    User configuration file is "/home/michael/.avrduderc"
    User configuration file does not exist or is not a regular file, skipping

    Using Port : /dev/ttyACM0
    Using Programmer : arduino
    Overriding Baud Rate : 19200
    AVR Part : ATmega16U2
    Chip Erase delay : 9000 us
    PAGEL : PD7
    BS2 : PC6
    RESET disposition : possible i/o
    RETRY pulse : SCK
    serial program mode : yes
    parallel program mode : yes
    Timeout : 200
    StabDelay : 100
    CmdexeDelay : 25
    SyncLoops : 32
    ByteDelay : 0
    PollIndex : 3
    PollValue : 0x53
    Memory Detail :

    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    eeprom 65 20 4 0 no 512 4 128 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    flash 65 6 128 0 yes 16384 128 128 4500 4500 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    lfuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    hfuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    efuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    lock 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    calibration 0 0 0 0 no 1 0 0 0 0 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    signature 0 0 0 0 no 3 0 0 0 0 0x00 0x00

    Programmer Type : Arduino
    Description : Arduino
    Hardware Version: 2
    Firmware Version: 1.18
    Topcard : Unknown
    Vtarget : 0.0 V
    Varef : 0.0 V
    Oscillator : Off
    SCK period : 0.1 us

Jumper wires

  • Since GND and RESET on the Due board are already routed to the AT16u2, only two jumper wires on the board are needed; the at16u2 then acts as a CMSIS-DAP device.
 ICSP            DEBUG(SWD)

SCK (3) <-----> SCK (2)
MOSI(4) <-----> SDO (3)
  • Flash with OpenOCD.
~$ openocd -f interface/cmsis-dap.cfg -f board/atmel_sam3x_ek.cfg  -c init -c halt -c "flash write_image erase nuttx.bin 0x80000" -c "at91sam3 gpnvm set 1" -c "exit"

Using a UM232H (FTDI) as the programmer

  • The approach above used an arduino board as the programmer; here an FT232H USB-to-SPI/I2C breakout board is used instead, wired to the board's DEBUG (SWD) header.
# FT232HQ minimodule channel 0 (Channel A)
# Connector FTDI Arduino Due(SWD)
# Pin Name
# --------- ------ ------
# CN2-10 GND GND (pin1)
# CN2-13 ADBUS0 (TCK) SWCLK (pin2)
# CN2-14 ADBUS2 (TDI/TDO) SWDIO (pin3)
# CN2-15 ADBUS1 (TDO/TDI) SWDIO (pin3)
# CN2-17 ADBUS4 (GPIOL0) nTRST (pin4)

  • You can also debug through the Due board's standard 10-pin ARM-JTAG header, wired much like the above, using either the full JTAG signals or only the SWD signals. When adapting to a 20-pin header, connect the corresponding signal lines.

  • Flash with OpenOCD; if the board still carries an arduino image, press reset and immediately run the command below.

~$ openocd -f interface/ftdi/ft232h-module-swd.cfg  -f board/atmel_sam3x_ek.cfg  -c init -c halt -c "flash write_image erase nuttx.bin 0x80000" -c "at91sam3 gpnvm set 1" -c "exit"

Connecting with OpenOCD

Miscellaneous

  • Write code to FLASH don’t change boot mode and don’t reset. This lets
    you examine the FLASH contents that you just loaded while the bootloader
    is still active.
~$ bossac.exe --port=COM26 --usb-port=false -e -w -v --boot=0 nuttx.bin
Write 64628 bytes to flash
[==============================] 100% (253/253 pages)
Verify 64628 bytes of flash
[==============================] 100% (253/253 pages)
Verify successful
  • Verify the FLASH contents (the bootloader must be running)
~$ bossac.exe --port=COM26 --usb-port=false -v nuttx.bin
Verify 64628 bytes of flash
[==============================] 100% (253/253 pages)
Verify successful
  • Read from FLASH to a file (the bootloader must be running):
~$ bossac.exe --port=COM26 --usb-port=false --read=4096 nuttx.dump
Read 4096 bytes from flash
[==============================] 100% (16/16 pages)
  • Change to boot from FLASH
~$ bossac.exe --port=COM26 --usb-port=false --boot=1
Set boot flash true

Restoring the AT16u2 firmware

  • ArduinoCore-sam

  • The following restores the original firmware of the AT16u2 on the Arduino Due board, this time with the FT232H connected to its ICSP header rather than an Arduino board as the programmer. All the firmware arduino supports ships inside its installation; since the Arduino Due uses the SAM3X8E chip, look under the ~/.arduino15/packages/arduino/hardware/sam/ directory, as follows.

~$ tree ~/.arduino15/packages/arduino/hardware/sam/1.6.12/firmwares
.arduino15/packages/arduino/hardware/sam/1.6.12/firmwares
└── atmega16u2
├── Arduino-DUE-usbserial-prod-firmware-2012-11-05.hex
├── Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex
└── arduino-usbserial
├── Arduino-usbserial.c
├── Arduino-usbserial.h
├── Board
│   └── LEDs.h
├── Descriptors.c
├── Descriptors.h
├── Lib
│   └── LightweightRingBuff.h
├── makefile
└── readme.txt

4 directories, 10 files

  • For the wiring, consult /etc/avrdude.conf and follow the comments inside it, adapting them to your own board.

FT232H Arduino Due (ICSP)

pin15 ADBUS2 <------> MISO pin1
+5V <------> +5V 2
pin13 ADBUS0 <------> SCK 3
pin14 ADBUS1 <------> MOSI 4
pin16 ADBUS3 <------> Reset 5
GND <------> GND 6

  • The flashing command is as follows; if there are problems, add -vvv to inspect the verbose output.
~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB0 -b 19200 -p atmega16u2 -U flash:w:Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex:i

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1e9489 (probably m16u2)
avrdude: NOTE: "flash" memory has been specified, an erase cycle will be performed
To disable this feature, specify the -D option.
avrdude: erasing chip
avrdude: reading input file "Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex"
avrdude: writing flash (4314 bytes):

Writing | ################################################## | 100% 7.56s

avrdude: 4314 bytes of flash written
avrdude: verifying flash memory against Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex:
avrdude: load data flash data from input file Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex:
avrdude: input file Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex contains 4314 bytes
avrdude: reading on-chip flash data:

Reading | ################################################## | 100% 7.28s

avrdude: verifying ...
avrdude: 4314 bytes of flash verified

avrdude: safemode: Fuses OK (E:F4, H:D9, L:FF)

avrdude done. Thank you.

Thanks for your support

  • WeChat QR code:

Processing CSV

  • A sample file looks like this:
$ head  TFWP_2020Q1_Positive_EN.csv
"Employers Who Were Issued a Positive Labour Market Impact Assessment (LMIA) by Program Stream, National Occupational Classification (NOC) 2011 and Business Location, January to March 2020",,,,,
Province/Territory,Program Stream,Employer ,Address,Occupation,Approved Positions
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0","2273-Deck officers, water transport",4
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0","2274-Engineer officers, water transport",4
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0",7242-Industrial electricians,1
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0",7532-Water transport deck and engine room crew,9
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0",7612-Other trades helpers and labourers,1
Newfoundland and Labrador, High Wage,Bailey Veterinary Surgical Specialty Ltd.,"St. John's, A1N3J7",3114-Veterinarians,1
Newfoundland and Labrador, High Wage,Eastern Regional Health Authority,"Mount Pearl, A1N3J5",3111-Specialist physicians,1
Newfoundland and Labrador, High Wage,WesTower Communications Ltd.,"St. John's, A1A5G6",7245-Telecommunications line and cable workers,2

[....]

  • As shown above, the first line is effectively a title, and the second line holds the CSV column fields, separated by commas.
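Note that quoted fields such as "Wanchai, A0A0A0" contain commas themselves; any CSV parser must honor the quoting. A small self-contained check with Python's stdlib csv module on one line from the file:

```python
import csv
import io

# One data row from the sample file above; the address is a quoted field.
line = ('Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,'
        '"Wanchai, A0A0A0","2273-Deck officers, water transport",4\n')

row = next(csv.reader(io.StringIO(line)))
# The quoted address stays a single field despite its embedded comma.
print(len(row), row[3])
```

A naive line.split(",") would yield eight pieces here; csv.reader correctly yields six fields.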
In [34]: import pandas as pd
# Skip reading the first line; alternatively use skiprows=1, nrows=8814 to read 8814 rows starting from the second line.
In [35]: a = pd.read_csv("/home/michael/Documents/TFWP_2020Q1_Positive_EN.csv",encoding="latin",skiprows=1)
In [37]: a.head()
Out[37]:
Province/Territory Program Stream Employer Address Occupation Approved Positions
0 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 2273-Deck officers, water transport 4.0
1 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 2274-Engineer officers, water transport 4.0
2 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 7242-Industrial electricians 1.0
3 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 7532-Water transport deck and engine room crew 9.0
4 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 7612-Other trades helpers and labourers 1.0

# e.g. export directly to HTML or Excel.
In [38]: a.to_html("name.html")

In [39]: a.to_excel("name.xlsx")

  • Filter on specific fields; below, find every row whose Occupation field starts with 2175-Web.
a[a.Occupation.str.startswith('2175-Web')]
Out[85]:
Province/Territory Program Stream Employer Address Occupation Approved Positions
53 Nova Scotia High Wage 10094277 Canada Inc Halifax, B3J2T9 2175-Web designers and developers 2
124 Nova Scotia Global Talent Stream 3rDi Laboratory Inc. Wolfville, B4P3R6 2175-Web designers and developers 2
207 Quebec High Wage 213A Studio Créatif Inc. Montréal, H2S3X3 2175-Web designers and developers 1
249 Quebec High Wage 9122-4790 Québec Inc. Laval, H7M5Y6 2175-Web designers and developers 1
511 Quebec High Wage Géoplus Inc. Laval, H7L5B7 2175-Web designers and developers 1
609 Quebec High Wage les produits de fenetres sol-r (2000) inc. montreal, H4N1H8 2175-Web designers and developers 1
668 Quebec High Wage Ossiaco Inc Montreal, H3C2G9 2175-Web designers and developers 1
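Beyond filtering, a grouped aggregation can total the Approved Positions per province. A minimal pandas sketch over a hypothetical three-row subset (the tiny DataFrame below is illustrative, not taken from the real file):

```python
import pandas as pd

# Illustrative subset mirroring the columns of the LMIA file above.
df = pd.DataFrame({
    "Province/Territory": ["Nova Scotia", "Quebec", "Quebec"],
    "Occupation": ["2175-Web designers and developers"] * 3,
    "Approved Positions": [2, 1, 1],
})

# Same startswith filter as above, then sum positions per province.
web = df[df["Occupation"].str.startswith("2175-Web")]
totals = web.groupby("Province/Territory")["Approved Positions"].sum()
print(totals)
```

On the real DataFrame a, the same groupby/sum applies after the a.Occupation.str.startswith('2175-Web') filter.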


Error handling

In [2]: pd.read_csv("/home/michael/Documents/TFWP_2020Q1_Positive_EN.csv")
[.....]
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 1: invalid continuation byte

  • Try reading with encoding="latin", e.g.: pd.read_csv("/home/michael/Documents/TFWP_2020Q1_Positive_EN.csv", encoding="latin").
