
DoH Service

~$ apt-get install dnscrypt-proxy -y
  • Open the file /etc/dnscrypt-proxy/dnscrypt-proxy.toml in your favorite editor. Find the general section and set the server_names variable.
server_names = ['cloudflare']
  • Then change the nameserver in /etc/resolv.conf to point at the local proxy.
    ~$ cat /etc/resolv.conf
    # Generated by dhcpcd
    nameserver 127.0.0.1
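Putting the two edits together, the relevant dnscrypt-proxy settings look like this (a sketch; server_names is the value set above, while listen_addresses is shown with its common default — on Debian the package may instead rely on systemd socket activation and leave it empty):

```toml
# /etc/dnscrypt-proxy/dnscrypt-proxy.toml (excerpt)
listen_addresses = ['127.0.0.1:53']   # where the proxy answers plain DNS queries
server_names = ['cloudflare']         # upstream DoH resolver
```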

NextCloud

Setting up nextcloud + collabora/code with docker-compose (no HTTPS)

nextcloud$ cat docker-compose.yml
version: "3"
services:
  pg11:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_PASSWORD: ${DB_PASS}
      POSTGRES_USER: ${DB_USER}
    logging:
      driver: "none"
    restart: unless-stopped
    volumes:
      - ./db-data:/var/lib/postgresql/data
    env_file:
      - ./.env

  collabora:
    image: collabora/code:latest
    restart: always
    environment:
      - username=${ADM_USER}
      - password=${ADM_PASS}
      # Note: the parameter line below is important.
      - "extra_params=--o:ssl.enable=false --o:net.post_allow.host=192.168.1.100 --o:storage.wopi.host=192.168.1.100 --o:ssl.termination=false"
    ports:
      - "9080:9980"
    cap_add:
      - MKNOD
    env_file:
      - ./.env
    volumes:
      - /etc/hosts:/etc/hosts:ro

  nextcloud:
    image: nextcloud
    restart: unless-stopped
    depends_on:
      - pg11
    cap_add:
      - MKNOD
      - ALL
    volumes:
      - ./volumes/html:/var/www/html
      - ./volumes/config:/var/www/html/config
      - ./volumes/custom_apps:/var/www/html/custom_apps
      - ./volumes/data:/var/www/html/data
      - ./volumes/theme:/var/www/html/themes
      - /etc/localtime:/etc/localtime:ro
      - /etc/hosts:/etc/hosts:ro
    ports:
      - 8080:80

  • If nextcloud is installed without HTTPS, the page http://cloud.domain.im/settings/apps cannot show the list of installable apps. You can instead download app packages manually from https://apps.nextcloud.com, extract them into nextcloud/html/apps, and enable them there; they then work just as if they had been installed online. Also, as shown above, add a record such as 192.168.1.100 cloud.domain.im to the local /etc/hosts and mount it into the containers, so that a domain name can be used for testing and for generating the security certificate.
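The /etc/hosts record in question is a single line (192.168.1.100 and cloud.domain.im are the example values used throughout this section):

```
# /etc/hosts -- local test record, mounted read-only into the containers
192.168.1.100   cloud.domain.im
```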

  • Installing the Collabora Online server integration

    ~$ wget -c https://github.com/nextcloud-releases/richdocuments/releases/download/v5.0.3/richdocuments-v5.0.3.tar.gz
    ~$ wget -c https://github.com/CollaboraOnline/richdocumentscode/releases/download/21.11.306/richdocumentscode.tar.gz
    ~$ cd volumes/html/apps
    ~$ for f in ~/richdocuments*.tar.gz; do tar xvf "$f"; done && sudo chown -R www-data:www-data richdocuments richdocumentscode
  • Configure the Collabora Online server URL.

    • Open http://cloud.domain.im:8080/settings/admin/richdocuments, select Use your own server, enter http://cloud.domain.im:9080, and press Save. If the status shows a green check mark and the message Collabora Online server is reachable., the server connection works.
  • Now go into the files area; if files such as docx and odt can be opened, Collabora Online is installed successfully.

  • Opening http://cloud.domain.im:9080/browser/dist/admin/admin.html shows the server's management and monitoring information; files currently open for editing in Collabora Online also appear in the Documents open list.

Self-signed HTTPS version

nextcloud$ tree -L 2
.
├── collabora
│   └── coolwsd.xml
├── create_self-sign-tls.sh
├── db-data [error opening dir]
├── docker-compose.yml
├── nginx
│   ├── certs
│   ├── nginx.conf
│   ├── sites-enabled
│   └── ssl
└── volumes
    ├── config
    ├── custom_apps
    ├── data
    ├── html
    └── theme

12 directories, 4 files

  • The nginx mount directory layout is as follows.
nextcloud$ tree nginx/
nginx/
├── certs
│   ├── cloud.domain.im.crt
│   └── cloud.domain.im.key
├── nginx.conf
├── sites-enabled
│   └── nextcloud.conf
└── ssl
    └── dhparam.pem

3 directories, 5 files

  • The nginx global main configuration file.

    nextcloud$ cat nginx/nginx.conf
    user nginx;
    worker_processes 1;
    error_log /var/log/nginx/error.log warn;
    pid /var/run/nginx.pid;

    events {
        worker_connections 2048;
        multi_accept on;
    }

    error_log syslog:server=unix:/dev/log,facility=local6,tag=nginx,severity=error;

    http {
        log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for"';
        access_log /var/log/nginx/access.log main;
        index index.html index.htm;
        charset utf-8;
        server_tokens off;
        autoindex off;
        client_max_body_size 512m;
        include mime.types;
        default_type application/octet-stream;
        sendfile on;
        sendfile_max_chunk 51200k;
        tcp_nopush on;
        tcp_nodelay on;
        open_file_cache max=1000 inactive=20s;
        open_file_cache_valid 30s;
        open_file_cache_min_uses 2;
        open_file_cache_errors off;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_session_tickets off;
        ssl_session_cache shared:SSL:50m;
        ssl_session_timeout 10m;
        ssl_stapling off;
        ssl_stapling_verify off;
        resolver 8.8.8.8 8.8.4.4; # replace with `127.0.0.1` if you have a local dns server
        ssl_prefer_server_ciphers on;
        ssl_dhparam ssl/dhparam.pem; # openssl dhparam -out ssl/dhparam.pem 4096
        gzip on;
        gzip_disable msie6;
        gzip_vary on;
        gzip_proxied any;
        gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
        include conf.d/*.conf;
        include sites-enabled/*.conf;
    }

  • The nextcloud site proxy configuration file.

nextcloud$ cat nginx/sites-enabled/nextcloud.conf
server {
    listen 80;
    listen 443 ssl http2;
    listen [::]:443 ssl http2;

    server_name cloud.domain.im;

    client_max_body_size 0;
    underscores_in_headers on;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        add_header Front-End-Https on;

        proxy_headers_hash_max_size 512;
        proxy_headers_hash_bucket_size 64;

        proxy_buffering off;
        proxy_redirect off;
        proxy_max_temp_file_size 0;
        proxy_pass http://cloud.domain.im:8880;
    }

    # static files
    location ^~ /browser {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Host $http_host;
    }

    # WOPI discovery URL
    location ^~ /hosting/discovery {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Host $http_host;
    }

    # Capabilities
    location ^~ /hosting/capabilities {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Host $http_host;
    }

    # main websocket
    location ~ ^/cool/(.*)/ws$ {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_set_header Host $http_host;
        proxy_read_timeout 36000s;
    }

    # download, presentation and image upload
    location ~ ^/(c|l)ool {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Host $http_host;
    }

    # Admin Console websocket
    location ^~ /cool/adminws {
        proxy_pass https://cloud.domain.im:9980;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_set_header Host $http_host;
        proxy_read_timeout 36000s;
    }

    ssl_certificate /etc/nginx/certs/cloud.domain.im.crt;
    ssl_certificate_key /etc/nginx/certs/cloud.domain.im.key;
}

  • A one-shot openssl script creates the self-signed certificates.
nextcloud$ cat create_self-sign-tls.sh
#!/bin/bash
DOMAIN=$1
[[ ! -e nginx/ssl ]] && mkdir -pv nginx/ssl
DHP=./nginx/ssl/dhparam.pem
openssl dhparam -out $DHP 2048
[[ ! -e nginx/certs ]] && mkdir -pv nginx/certs
KEY=./nginx/certs/${DOMAIN}.key
CRT=./nginx/certs/${DOMAIN}.crt
openssl req -new -newkey rsa:4096 -days 3650 -nodes -x509 -subj "/C=US/ST=NC/L=Local/O=Dev/CN=${DOMAIN}" -keyout $KEY -out $CRT
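Before wiring the generated files into nginx, the certificate can be sanity-checked with openssl x509. The sketch below generates a throwaway certificate for a made-up test.local name (so it does not depend on the script above) and prints its subject and expiry; the same -noout -subject -enddate flags work on the real nginx/certs/*.crt files:

```shell
# Create a scratch directory and a short-lived self-signed certificate.
WORK=$(mktemp -d)
openssl req -new -newkey rsa:2048 -days 1 -nodes -x509 \
    -subj "/C=US/ST=NC/L=Local/O=Dev/CN=test.local" \
    -keyout "$WORK/test.local.key" -out "$WORK/test.local.crt" 2>/dev/null

# Print the subject and the notAfter date to confirm CN and validity.
openssl x509 -noout -subject -enddate -in "$WORK/test.local.crt"
```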

  • One-shot script to create self-signed TLS certificates (for frp).
#!/bin/bash

#
# If you use frp via an IP address rather than a hostname, make sure to set the
# appropriate IP address in the Subject Alternative Name (SAN) field when
# generating the SSL/TLS certificates.

DAYS=5000
export DOMAIN=$1
if [ -z "$DOMAIN" ]; then
    echo "no domain name given, using default: example.com"
    DOMAIN="example.com"
fi
rm -rf $DOMAIN
[ ! -d $DOMAIN ] && mkdir $DOMAIN && cd $DOMAIN

cat > my-openssl.cnf << EOF
[ ca ]
default_ca = CA_default
[ CA_default ]
x509_extensions = usr_cert
[ req ]
default_bits = 2048
default_md = sha256
default_keyfile = privkey.pem
distinguished_name = req_distinguished_name
attributes = req_attributes
x509_extensions = v3_ca
string_mask = utf8only
prompt = no
[ req_distinguished_name ]
C = US
ST = VA
L = SomeCity
O = MyCompany
OU = MyDivision
CN = ${DOMAIN}
[ req_attributes ]
[ usr_cert ]
basicConstraints = CA:FALSE
nsComment = "OpenSSL Generated Certificate"
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer
[ v3_ca ]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = CA:true
EOF


# build ca certificates

echo "---> build ca certificates"
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=${DOMAIN}" -days ${DAYS} -out ca.crt


# build frps server side certificates

mkdir server
echo "---> build frps server side certificates"
openssl genrsa -out server/server.key 2048

openssl req -new -sha256 -key server/server.key -out server/server.csr -config my-openssl.cnf

openssl x509 -req -days ${DAYS} \
    -in server/server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -extfile <(printf "subjectAltName=IP:127.0.0.1") \
    -out server/server.crt

# build frpc client side certificates

echo "---> build frpc client side certificates"
mkdir client
openssl genrsa -out client/client.key 2048
openssl req -new -sha256 -key client/client.key -out client/client.csr -config my-openssl.cnf

openssl x509 -req -days ${DAYS} \
    -in client/client.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -extfile <(printf "subjectAltName=IP:127.0.0.1") \
    -out client/client.crt

cp ca.crt server/ca.crt
mv ca.crt client/ca.crt
rm ca.key ca.srl client/client.csr server/server.csr

echo "create Certificates done!!!!"
echo "verify the server Certificates"

cd server
openssl verify -CAfile ca.crt server.crt
cd ../client
openssl verify -CAfile ca.crt client.crt
cd ..
chmod 644 client/client.key server/server.key

  • The final docker-compose file.
nextcloud$ cat docker-compose.yml
version: "3"
services:
  nginx:
    image: nginx:latest
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
      - ./nginx/ssl/dhparam.pem:/etc/nginx/ssl/dhparam.pem
      - ./nginx/certs:/etc/nginx/certs
      - ./nginx/sites-enabled:/etc/nginx/sites-enabled
      - /etc/hosts:/etc/hosts:ro
    ports:
      - 80:80
      - 443:443
    environment:
      - SITE_DOMAIN=cloud.domain.im
    depends_on:
      - nextcloud

  pg11:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: db3
      POSTGRES_PASSWORD: rocks
      POSTGRES_USER: cloudjs
    logging:
      driver: "none"
    restart: unless-stopped
    volumes:
      - ./db-data:/var/lib/postgresql/data

  collabora:
    image: collabora/code:latest
    restart: always
    environment:
      - username=admin
      - password=admin
      - cert_domain=cloud.domain.im
      - "extra_params=--o:ssl.enable=true --o:net.post_allow.host=192.168.1.100 --o:storage.wopi.host=192.168.1.100 --o:ssl.termination=true"
    ports:
      - "9980:9980"
    cap_add:
      - MKNOD
    volumes:
      - /etc/hosts:/etc/hosts:ro
      - ./collabora/coolwsd.xml:/etc/coolwsd/coolwsd.xml

  nextcloud:
    image: nextcloud
    restart: unless-stopped
    depends_on:
      - pg11
    cap_add:
      - MKNOD
    volumes:
      - ./volumes/html:/var/www/html
      - ./volumes/config:/var/www/html/config
      - ./volumes/custom_apps:/var/www/html/custom_apps
      - ./volumes/data:/var/www/html/data
      - ./volumes/theme:/var/www/html/themes
      - /etc/localtime:/etc/localtime:ro
      - /etc/hosts:/etc/hosts:ro
    ports:
      - 8880:80

  • As the file above shows, the default configuration file inside the collabora/code container must also be modified: first copy the original file out of a running container, edit it, and then mount it back to override the one inside the container.
nextcloud$ docker cp <container ID>:/etc/coolwsd/coolwsd.xml ./collabora/coolwsd.xml
nextcloud$ chmod 755 ./collabora/coolwsd.xml
  • Open coolwsd.xml, find the server_name node, and set its value to cloud.domain.im, as follows:
nextcloud$ cat collabora/coolwsd.xml
[...]
<server_name desc="External hostname:port of the server running coolwsd. If empty, it's derived from the request (please set it if this doesn't work). May be specified when behind a reverse-proxy or when the hostname is not reachable directly." type="string" default="">cloud.domain.im</server_name>
[...]
  • Testing showed that without the coolwsd.xml change above, Collabora Online could not open files for editing, and the log reported the following error.

    wsd-00001-00039 2022-04-22 15:52:33.479492 +0000 [ websrv_poll ] ERR  #29 Error while handling poll at 0 in websrv_poll: #29BIO error: 0, rc: -1: error:00000000:lib(0):func(0):reason(0):| net/Socket.cpp:467
  • Because nginx is used as the HTTPS proxy, nextcloud/config/config.php also needs to be modified: add 'overwriteprotocol' => 'https', so that redirects use HTTPS.

nextcloud$ sudo cat volumes/config/config.php
[...]
  'overwriteprotocol' => 'https',
  'datadirectory' => '/var/www/html/data',
  'dbtype' => 'pgsql',
  'version' => '23.0.3.2',
  'overwrite.cli.url' => 'http://cloud.domain.im',
  'dbname' => 'db3',
  'dbhost' => 'pg11',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'dbuser' => 'oc_lcy',
  'appstoreenabled' => true,
  'appstoreurl' => 'https://apps.nextcloud.com/api/v1',
  'installed' => true,
);

Docker installation

  • nextcloud/docker
  • This instance is for personal use, so there are no special database requirements and sqlite is enough, but HTTPS is still wanted for security. Install it with docker.
~$ docker images | grep "nextcloud" || docker pull nextcloud
~$ if [ ! -d nextcloud ]; then
mkdir -pv nextcloud/{nextcloud,config,custom_apps,data,theme}
fi
~$ cd nextcloud && docker run -d --restart=always --name nextcloud -p 8443:443 \
-v nextcloud:/var/www/html \
-v config:/var/www/html/config \
-v custom_apps:/var/www/html/custom_apps \
-v data:/var/www/html/data \
-v theme:/var/www/html/themes \
-v /etc/localtime:/etc/localtime:ro \
nextcloud
  • Below, nextcloud runs in docker with the sqlite database and is exposed externally through a cloudflare tunnel; no nginx-style proxy is configured.
<?php
$CONFIG = array (
  'htaccess.RewriteBase' => '/',
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'apps_paths' =>
  array (
    0 =>
    array (
      'path' => '/var/www/html/apps',
      'url' => '/apps',
      'writable' => false,
    ),
    1 =>
    array (
      'path' => '/var/www/html/custom_apps',
      'url' => '/custom_apps',
      'writable' => true,
    ),
  ),
  'instanceid' => 'xxxxxx',
  'passwordsalt' => 'xxxxxx',
  'secret' => 'xxxxxx',
  'trusted_domains' =>
  array (
    0 => '192.168.1.182',
    1 => 'mycloud.example.com',
  ),
  'datadirectory' => '/var/www/html/data',
  'dbtype' => 'sqlite3',
  'version' => '26.0.2.1',
  'overwrite.cli.url' => 'https://mycloud.example.com/',
  'overwritehost' => '',
  'overwriteprotocol' => 'https',
  'installed' => true,
);

  • The parameters that need to be configured:
    • trusted_domains: include the tunnel hostname.
    • overwrite.cli.url: https://mycloud.example.com/
    • overwritehost: ''
    • overwriteprotocol: 'https', because the tunnel serves external access over https.

Dokku installation

Installing plugins

~$ sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git
~$ sudo dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres

Installation

~$ dokku apps:create mycloud
~$ docker pull nextcloud
~$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
nextcloud    latest   7aa569922593   11 hours ago   835MB

# Note: The image must be retagged `dokku/<app-name>:<version>`
~$ docker tag nextcloud:latest dokku/mycloud:latest

~$ sudo mkdir -p /var/lib/dokku/data/storage/mycloud
~$ sudo chown -R dokku:dokku /var/lib/dokku/data/storage/mycloud
~$ dokku storage:mount mycloud /var/lib/dokku/data/storage/mycloud:/var/www/html
~$ dokku tags:deploy mycloud latest
  • Link the database. After running the commands below, open a browser and go through the first-run configuration of the admin account and database type (Storage & database); the default is SQLite.

    ~$ dokku postgres:create mycloud_db
    ~$ dokku postgres:link mycloud_db mycloud
  • Set a domain name. The domain is mainly needed for Let's Encrypt and for SNI. No registered domain is used here; a dynamic domain from dynv6.net is used instead.

    ~$ dokku domains:add mycloud  nc.llccyy.dynv6.net
    ~$ dokku domains:remove mycloud mycloud.localhost
    ~$ dokku config:set mycloud --no-restart DOKKU_LETSENCRYPT_EMAIL=yjdwbj@gmail.com
    ~$ dokku letsencrypt mycloud
  • Certificate retrieval error:

darkhttpd/1.12, copyright (c) 2003-2016 Emil Mikulic.
listening on: http://0.0.0.0:80/
2021-02-28 08:15:08,915:INFO:__main__:1406: Generating new certificate private key
2021-02-28 08:15:32,772:ERROR:__main__:1388: CA marked some of the authorizations as invalid, which likely means it could not access http://example.com/.well-known/acme-challenge/X. Did you set correct path in -d example.com:path or --default_root? Are all your domains accessible from the internet? Please check your domains' DNS entries, your host's network/firewall setup and your webserver config. If a domain's DNS entry has both A and AAAA fields set up, some CAs such as Let's Encrypt will perform the challenge validation over IPv6. If your DNS provider does not answer correctly to CAA records request, Let's Encrypt won't issue a certificate for your domain (see https://letsencrypt.org/docs/caa/). Failing authorizations: https://acme-v02.api.letsencrypt.org/acme/authz-v3/11204345532
Challenge validation has failed, see error log.

Debugging tips: -v improves output verbosity. Help is available under --help.
-----> Certificate retrieval failed!
-----> Disabling ACME proxy for fpm...
done
  • The error above is basically a DNS A-record problem: the domain cannot be resolved. It can also be an issue with the domain provider; this happened with dynv6.com, and sometimes retrying later without any change succeeded. Trying another DDNS provider is also an option.
  • A few requirements to note when obtaining Let's Encrypt certificates in dokku:
    • You must have a domain name (with an A or AAAA record); a second-level dynamic domain also works.
    • DOKKU_DOCKERFILE_PORTS must be 80/tcp. For example, nextcloud:latest exposes 80/tcp, while nextcloud:fpm-alpine exposes 9000/tcp and is only a php-fpm instance without a web server; so an app created from nextcloud:latest can obtain a certificate correctly.
    • If DOKKU_DOCKERFILE_PORTS is not port 80, first set up the proxy with dokku proxy:ports-set <APP> http:80:9000, then request the certificate.
  • If the app was created from a Docker image, i.e. deployed with dokku tags:deploy <appname> latest, the ports it exposes can be inspected with the following docker command.
~$ docker inspect <appname>.web.1 | jq ".[0].Config.ExposedPorts"
  • Or inspect the image configuration directly:
    ~$ docker image inspect <image tag> | jq '.[0].Config.ExposedPorts'
  • Setting the upload file size limit
  • nginx configuration
~$ dokku nginx:set mycloud  client-max-body-size 100m
~$ dokku nginx:show-config mycloud

Audio & Video

Jellyfin media center

~$ docker pull jellyfin/jellyfin
~$ docker run -d -p 8096:8096 --name jellyfin -v `pwd`/jellyfin/config:/config -v `pwd`/jellyfin/cache:/cache -v `pwd`/Incoming:/media --restart=unless-stopped docker.io/jellyfin/jellyfin:latest

KODI (XBMC)

  • kodi

  • Kodi supports a very wide range of platforms. On a Xiaomi Mi TV 2, which runs a deeply customized Android 4.3, the newest version that can be installed is kodi-16.1-Jarvis-armeabi-v7a.apk. In that version many of the built-in add-ons no longer work, and many third-party plugins fail too; for example, the jellyfin-kodi plugin is unusable. Still, pairing it with minidlna works well, at least a better experience than SMBFS.

Local bookmarks service

PDF-to-text OCR

poppler-utils

  • The example below uses a Japanese manual: pdftotext cannot convert it (the output is garbled), and pdftohtml reports a copyright-permission problem.
~$ sudo apt-get install poppler-utils
~$ pdftotext ~/Downloads/SDFA.pdf target.txt
~$ pdftohtml ~/Downloads/SDFA.pdf target.html
Permission Error: Copying of text from this document is not allowed.

ocrmypdf

  • ocrmypdf cannot convert an encrypted PDF either.
~$ sudo apt-get install ocrmypdf
~$ ocrmypdf ~/Downloads/SDFA.pdf -l jpn test.txt
EncryptedPdfError: Input PDF is encrypted. The encryption must be removed to
perform OCR.

For information about this PDF's security use
qpdf --show-encryption infilename

You can remove the encryption using
qpdf --decrypt [--password=[password]] infilename

Tesseract

~$ dpkg -l | grep "tesseract"
ii libtesseract-dev:amd64 4.1.1-2.1 amd64 Development files for the tesseract command line OCR tool
ii libtesseract4:amd64 4.1.1-2.1 amd64 Tesseract OCR library
ii tesseract-ocr 4.1.1-2.1 amd64 Tesseract command line OCR tool
ii tesseract-ocr-chi-sim 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for Chinese - Simplified
ii tesseract-ocr-cym 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for Welsh
ii tesseract-ocr-dev 3.04.01-5 all transitional dummy package
ii tesseract-ocr-eng 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for English
ii tesseract-ocr-equ 3.04.00-1 all tesseract-ocr language files for equations
ii tesseract-ocr-jpn 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for Japanese
ii tesseract-ocr-osd 1:4.00~git30-7274cfa-1.1 all tesseract-ocr language files for script and orientation
  • Tesseract does not support reading PDF files directly.

    ~$ tesseract ~/Downloads/SDFA.pdf ttt --dpi 150
    Tesseract Open Source OCR Engine v4.1.1 with Leptonica
    Error in pixReadStream: Pdf reading is not supported
    Error in pixRead: pix not read
    Error during processing.

  • First convert it into individual png pages:

~$ pdftoppm -png ~/Downloads/SDFA.pdf turing
~$ ls turing-*.png
turing-1.png turing-2.png turing-3.png turing-4.png turing-5.png
  • Install the target language pack, then run tesseract on a page.

    ~$ tesseract  --list-langs
    List of available languages (5):
    chi_sim
    cym
    eng
    jpn
    osd

    ~$ tesseract turing-2.png turing -l jpn --dpi 150
  • This writes the recognized text to turing.txt.
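To OCR every page instead of just one, the tesseract call can be wrapped in a loop (a sketch assuming the turing- page prefix and the jpn language pack used above):

```shell
#!/usr/bin/env bash
# OCR each rendered page: ${f%.png} strips the extension, so the text for
# turing-1.png is written to turing-1.txt, and so on.
shopt -s nullglob            # the glob expands to nothing when no pages match
for f in turing-*.png; do
    tesseract "$f" "${f%.png}" -l jpn --dpi 150
done
```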

Frog

Offline wiki

Kiwix

~$ sudo apt-get install libxapian-dev libpugixml-dev
~$ git clone https://github.com/kiwix/libkiwix
~$ cd libkiwix
~$ cat >build.patch<<EOF
index fca77ec..5111206 100644
--- a/meson.build
+++ b/meson.build
@@ -28,9 +28,9 @@ zlib_dep = dependency('zlib', static:static_deps)
 xapian_dep = dependency('xapian-core', static:static_deps)
 
 if compiler.has_header('mustache.hpp')
-  extra_include = []
-elif compiler.has_header('mustache.hpp', args: '-I/usr/include/kainjow')
-  extra_include = ['/usr/include/kainjow']
+  extra_include = ['/home/michael/3TB-DISK/github/kiwix/Mustache']
+elif compiler.has_header('mustache.hpp', args: '-I/home/michael/3TB-DISK/github/kiwix/Mustache')
+  extra_include = ['/home/michael/3TB-DISK/github/kiwix/Mustache']
 else
   error('Cannot found header mustache.hpp')
 endif
EOF
~$ meson . build && ninja -C build install
~$ git clone https://github.com/openzim/libzim
~$ cd libzim
~$ meson . build && ninja -C build install
~$ git clone https://github.com/kiwix/kiwix-desktop
~$ cd kiwix-desktop
~$ qmake && make -j10 install
  • Building a deb package directly
~$ cat > rules.patch <<'EOF'
index f023663..f4f2877 100755
--- a/debian/rules
+++ b/debian/rules
@@ -4,3 +4,7 @@ export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 
 %:
 	dh $@
+
+override_dh_shlibdeps:
+	dh_shlibdeps --dpkg-shlibdeps-params=--ignore-missing-info
EOF
~ kiwix-desktop$ git apply rules.patch
~ kiwix-desktop$ dpkg-buildpackage -us -uc

Setting up a Matrix server

~$ mkdir matrix
~$ docker network create --driver=bridge --subnet=10.10.10.0/24 --gateway=10.10.10.1 matrix_net
~$ cd matrix
  • If this is only for testing, or VPS resources are limited, sqlite3 is enough; comment out the postgres service.

  • docker-compose.yaml

version: '3.8'
services:
  postgres:
    image: postgres:11-alpine
    restart: unless-stopped
    networks:
      default:
        ipv4_address: 10.10.10.2
    volumes:
      - ./postgresdata:/var/lib/postgresql/data

    # These will be used in homeserver.yaml later on
    environment:
      - POSTGRES_DB=synapse
      - POSTGRES_USER=synapse
      - POSTGRES_PASSWORD=STRONGPASSWORD
      - POSTGRES_INITDB_ARGS=--lc-collate C --lc-ctype C --encoding UTF8

  element:
    image: vectorim/element-web:latest
    restart: unless-stopped
    volumes:
      - ./element-config.json:/app/config.json
    networks:
      default:
        ipv4_address: 10.10.10.3

  synapse:
    image: matrixdotorg/synapse:latest
    restart: unless-stopped
    networks:
      default:
        ipv4_address: 10.10.10.4
    volumes:
      - ./synapse:/data

networks:
  default:
    external:
      name: matrix_net

  • Download element's template configuration file and delete the line "default_server_name": "matrix.org".
matrix$ wget https://develop.element.io/config.json
matrix$ mv config.json element-config.json
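If jq is available (an assumption), deleting the default_server_name key can be done non-interactively instead of by hand:

```shell
# Write a copy without the default_server_name key; the guard keeps this a
# no-op when config.json has not been downloaded yet.
if [ -f config.json ]; then
    jq 'del(.default_server_name)' config.json > element-config.json
fi
```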

Generating the Synapse configuration file

matrix$ mkdir synapse
matrix$ docker run -it --rm \
-v "$(pwd)/synapse:/data" \
-e SYNAPSE_SERVER_NAME=matrix.example.com \
-e SYNAPSE_REPORT_STATS=yes \
matrixdotorg/synapse:latest generate
  • Synapse uses sqlite3 by default. To use postgres instead, set the database section of synapse/homeserver.yaml as follows:
database:
  name: psycopg2
  args:
    user: synapse
    password: STRONGPASSWORD
    database: synapse
    host: postgres
    cp_min: 5
    cp_max: 10
  • Create a new user
matrix$ docker-compose up -d

matrix$ docker exec -it matrix_synapse_1 bash
~# register_new_matrix_user -c /data/homeserver.yaml http://localhost:8008
New user localpart [root]: ruan
Password:
Confirm password:
Make admin [no]: yes
Sending registration request...
Success!

Caddy v2 proxy

$ apt install -y debian-keyring debian-archive-keyring apt-transport-https
$ curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo tee /etc/apt/trusted.gpg.d/caddy-stable.asc
$ curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
$ apt update
$ apt install caddy -y
matrix$ cat Caddyfile
http://matrix.example.com {
    tls internal
    reverse_proxy /_matrix/* 10.10.10.4:8008
    reverse_proxy /_synapse/client/* 10.10.10.4:8008
    log {
        output file /var/log/caddy/matrix.example.log
    }
    header {
        X-Content-Type-Options nosniff
        Referrer-Policy strict-origin-when-cross-origin
        Strict-Transport-Security "max-age=63072000; includeSubDomains;"
        Permissions-Policy "accelerometer=(), camera=(), geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=(), interest-cohort=()"
        X-Frame-Options SAMEORIGIN
        X-XSS-Protection 1
        X-Robots-Tag none
        -server
    }
}

http://element.example.com {
    tls internal
    encode zstd gzip
    reverse_proxy 10.10.10.3:80

    log {
        output file /var/log/caddy/element.example.log
    }
    header {
        X-Content-Type-Options nosniff
        Referrer-Policy strict-origin-when-cross-origin
        Strict-Transport-Security "max-age=63072000; includeSubDomains;"
        Permissions-Policy "accelerometer=(), camera=(), geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=(), interest-cohort=()"
        X-Frame-Options SAMEORIGIN
        X-XSS-Protection 1
        X-Robots-Tag none
        -server
    }
}

  • Load the configuration into the running system
matrix$ caddy adapt --config /path/path/Caddyfile
matrix$ caddy fmt
matrix$ caddy reload

  • Note: if a site in the Caddyfile is written simply as <domain.com> {}, Caddy automatically redirects http -> https, which for intranet or local browser testing produces errors such as Mixed content blocked.
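The two forms side by side (placeholder domain; the reverse_proxy target mirrors the element block of this section):

```
# Bare address: Caddy provisions a certificate and redirects http -> https.
site.example.com {
    reverse_proxy 10.10.10.3:80
}

# Explicit http:// prefix: plain HTTP only, no redirect (the form used above).
http://site.example.com {
    reverse_proxy 10.10.10.3:80
}
```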

Connecting the mobile app

  • Open the web admin page https://<domain>/settings/user/security. Under Devices & sessions, click Create new app password to generate a one-time username and password, then click Show QR code for mobile apps; in the mobile nextcloud app, choose to connect to a self-hosted server and scan the code.
  • If you run into problems, check the logs at https://<domain>/settings/admin/logging.

MinIO

Running in a container

~$ sudo apt-get install podman -y
~$ podman run \
-p 9000:9000 \
-p 9001:9001 \
minio/minio server /data --console-address ":9001"
  • If the following error appears, make sure /etc/containers/registries.conf contains the line unqualified-search-registries=["docker.io"], then retry.

    Error: error getting default registries to try: short-name "minio/minio" did not resolve to an alias and no unqualified-search registries are defined in "/etc/containers/registries.conf"
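The fix is a single line in the registries configuration:

```toml
# /etc/containers/registries.conf (excerpt)
unqualified-search-registries = ["docker.io"]
```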
  • It can also be run with docker

    ~$ docker run -it --rm -p 9000:9000 \
    -v `pwd`/minio-data:/data \
    -e MINIO_ROOT_USER=minio \
    -e MINIO_ROOT_PASSWORD=minio123 \
    -p 9001:9001 minio/minio server /data --console-address ":9001"

Running the standalone binary directly

~$ wget http://dl.minio.org.cn/server/minio/release/darwin-amd64/minio
~$ chmod +x minio
~$ ./minio server /data

Client access

mc

~$ ./mc alias set myminio http://127.0.0.1:9000 minio minio123
Added `myminio` successfully.

  • Add a user

    ~$ ./mc admin user add myminio testuser testpwd123
    Added user `testuser` successfully.
    ~$ ./mc admin user info myminio testuser
    AccessKey: testuser
    Status: enabled
    PolicyName:
    MemberOf:
  • Create a bucket

    ./mc mb myminio/test-new-s3-bucket
    Bucket created successfully `myminio/test-new-s3-bucket`.
  • Create a bucket policy

~$ cat test-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:ListAllMyBuckets"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::<change-this-to-your-bucket-name>/*"
            ],
            "Sid": "Public"
        }
    ]
}
  • Add the policy file to the server

    ~$ ./mc admin policy add myminio test-policy test-policy.json
    Added policy `test-policy` successfully.

  • Apply the policy to the given user

    ~$ ./mc admin policy set myminio "test-policy" user=testuser
    Policy `test-policy` is set on user `testuser`

    ~$ ./mc admin user info myminio testuser
    AccessKey: testuser
    Status: enabled
    PolicyName: test-policy
    MemberOf:

awscli

~$ pip3 install awscli
~$ aws configure --profile minio
AWS Access Key ID [None]: minio
AWS Secret Access Key [None]: minio123
Default region name [None]: myminio
Default output format [None]: json

  • Create a bucket.

    ~$ aws --profile minio --endpoint-url http://127.0.0.1:9000 s3 mb s3://new-s3-bucket
  • List the buckets on the server.

    ~$ aws --profile minio --endpoint-url http://127.0.0.1:9000 s3 ls
    2021-12-13 23:16:20 new-s3-bucket
    2021-12-13 22:58:15 test-new-s3-bucket
  • Upload a file

    ~$ aws --profile minio --endpoint-url http://127.0.0.1:9000 s3 cp test-policy.json s3://new-s3-bucket
    upload: ./test-policy.json to s3://new-s3-bucket/test-policy.json
  • For comparison, here is the structure of the local minio-data directory bound into the minio container.

minio$ tree minio-data/
minio-data/
├── new-s3-bucket
│   └── test-policy.json
└── test-new-s3-bucket

2 directories, 1 file

Syncthing

~$ docker pull syncthing/syncthing
~$ docker run -p 8384:8384 -p 22000:22000 -v <YOUR PC FOLDER>/share:/var/syncthing syncthing/syncthing:latest

Creating with docker-compose

  • Here docker-compose is used with Traefik reverse-proxy support so the service can be exposed externally; this is only local internal testing, so HTTPS and a real domain are not configured.

    syncthing$ cat .env
    # Syncthing
    DOCKER_SYNCTHING_IMAGE_NAME=syncthing/syncthing
    DOCKER_SYNCTHING_HOSTNAME=syncthing-on-storage
    DOCKER_SYNCTHING_DOMAIN=syncthing.localhost

    # discosrv
    DOCKER_DISCOSRV_IMAGE_NAME=syncthing/discosrv
    DOCKER_DISCOSRV_HOSTNAME=discosrv-on-storage
    DOCKER_DISCOSRV_DOMAIN=discosrv.localhost

    # exporter
    DOCKER_EXPORTER_IMAGE_NAME=soulteary/syncthing-exporter
    # xxd -l 16 -p /dev/random | base64
    DOCKER_EXPORTER_API_TOKEN=OTU0NGJmMGJhYzRiNGEzM2Q3Yzc4MjhjOTdhZjJkMDAK
    DOCKER_EXPORTER_HOSTNAME=syncthing-exporter-on-storage
    DOCKER_EXPORTER_DOMAIN=syncthing-exporter.localhost

  • The docker-compose.yml file; it relies on the .env file in the current directory.

syncthing$ cat docker-compose.yml

version: "3"

services:
  syncthing:
    image: ${DOCKER_SYNCTHING_IMAGE_NAME}
    container_name: ${DOCKER_SYNCTHING_HOSTNAME}
    hostname: ${DOCKER_SYNCTHING_HOSTNAME}
    environment:
      - PUID=1000
      - PGID=1000
    volumes:
      - ./data:/var/syncthing
    ports:
      - "22000:22000"
    restart: always
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik_default"
      - "traefik.http.routers.sync-http.entrypoints=http"
      - "traefik.http.routers.sync-http.rule=Host(`${DOCKER_SYNCTHING_DOMAIN}`)"
      - "traefik.http.routers.sync-http.service=sync-backend"
      - "traefik.http.services.sync-backend.loadbalancer.server.scheme=http"
      - "traefik.http.services.sync-backend.loadbalancer.server.port=8384"
    networks:
      - traefik_default
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

networks:
  traefik_default:
    external: true
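The file above expects an external traefik_default network and a Traefik instance with an entrypoint named http. A minimal companion stack could look like this (a sketch; the image tag and flag values are assumptions, only the network name and entrypoint name come from the labels above):

```yaml
# docker-compose.traefik.yml (hypothetical companion stack)
version: "3"
services:
  traefik:
    image: traefik:v2.9
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"   # only route labeled containers
      - "--entrypoints.http.address=:80"              # entrypoint name used by the labels
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro  # let Traefik watch containers
    networks:
      - traefik_default

networks:
  traefik_default:
    external: true
```

The network must exist before either stack starts, e.g. docker network create traefik_default.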
  • Open http://127.0.0.1:8384 in a browser to reach the console; on mobile, install the syncthing client to connect.

Building from source and running as a systemd service

~$ git clone https://github.com/syncthing/syncthing
~$ cd syncthing && ./build.sh


~$ sudo cp bin/syncthing /usr/bin/
~$ sudo cp etc/linux-systemd/user/syncthing.service /etc/systemd/user/
~$ systemctl --user --now enable syncthing
~$ systemctl --user status syncthing

Installing MySQL + phpMyAdmin with docker-compose

version: '3'

services:
  db:
    image: mysql
    container_name: zlib-db
    environment:
      MYSQL_ROOT_PASSWORD: 123456
      MYSQL_DATABASE: zlib-db
      MYSQL_USER: zlib
      MYSQL_PASSWORD: zlib123
    ports:
      - "3306:3306"
    volumes:
      - ./db-data:/var/lib/mysql
  phpmyadmin:
    image: phpmyadmin
    container_name: pma
    links:
      - db
    environment:
      PMA_HOST: zlib-db
      PMA_PORT: 3306
      PMA_ARBITRARY: 1
    restart: always
    ports:
      - 8081:80

# declared but unused here; the db service uses the ./db-data bind mount instead
volumes:
  dbdata:

Collaborative documents

CryptPad

  • xwiki-labs/cryptpad
    ~$ sudo dokku apps:create cryptpad
    ~$ docker tag promasu/cryptpad:latest dokku/cryptpad:latest
    ~$ sudo dokku tags:deploy cryptpad latest
    ~$ sudo dokku domains:add cryptpad cryptpad.llccyy.dynv6.net

wiki.js

Installing with docker-compose

wiki.js$ cat docker-compose.yml
version: "3"
services:

  db:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: wiki
      POSTGRES_PASSWORD: wikijsrocks
      POSTGRES_USER: wikijs
    logging:
      driver: "none"
    restart: unless-stopped
    networks:
      - traefik_default
    volumes:
      - ./db-data:/var/lib/postgresql/data

  wiki:
    image: requarks/wiki:2
    depends_on:
      - db
    networks:
      - traefik_default
    environment:
      DB_TYPE: postgres
      DB_HOST: db
      DB_PORT: 5432
      DB_USER: wikijs
      DB_PASS: wikijsrocks
      DB_NAME: wiki
    restart: unless-stopped
    expose:
      - 3000
    labels:
      - traefik.enable=true
      - traefik.docker.network=traefik_default
      - traefik.http.routers.wiki.rule=Host(`wiki.localhost`)
      - traefik.http.routers.wiki.entrypoints=http
      - traefik.http.services.wiki.loadbalancer.server.port=3000

networks:
  traefik_default:
    external: true
  • As shown above, docker-compose.yml has the Traefik reverse proxy enabled. Once it is up, open http://wiki.localhost/ to reach the wiki.js setup wizard.

Configuring TLS and a domain name

cat docker-compose-wikijs.yml
version: "3"
services:

  db:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: wiki
      POSTGRES_PASSWORD: wikijsrocks
      POSTGRES_USER: wikijs
    logging:
      driver: "none"
    restart: unless-stopped
    networks:
      - traefik_default
    volumes:
      - ./db-data:/var/lib/postgresql/data

  wiki:
    image: requarks/wiki:2
    depends_on:
      - db
    networks:
      - traefik_default
    environment:
      DB_TYPE: postgres
      DB_HOST: db
      DB_PORT: 5432
      DB_USER: wikijs
      DB_PASS: wikijsrocks
      DB_NAME: wiki
    restart: unless-stopped
    expose:
      - 3000
    labels:
      - traefik.enable=true
      - traefik.docker.network=traefik_default
      - traefik.http.routers.wiki.rule=Host(`<your full domain name>`) && (PathPrefix(`/wiki`) || PathPrefix(`/_assets`))
      - traefik.http.routers.wiki.entrypoints=websecure
      - traefik.http.routers.wiki.tls.certresolver=myresolver
      - traefik.http.services.wiki.loadbalancer.server.port=3000

networks:
  traefik_default:
    external: true
  • As shown above, the two directives entrypoints=websecure and tls.certresolver=myresolver are defined on the traefik startup command line, and the router Rule must match the /wiki and /_assets path prefixes. The traefik startup command looks roughly like this:
[...]
  traefik:
    image: "traefik:v2.5"
    container_name: "traefik"
    command:
      - "--api.insecure=false"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.file.directory=/letsencrypt/"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.myresolver.acme.httpchallenge=true"
      - "--certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=web"
      #- "--certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.myresolver.acme.email=<your email>@gmail.com"
      - "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json"
[...]
  • On the first visit the wiki shows an initialization page; the SITE URL field must be set to https://<your full domain>/wiki.
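The `[...]` markers in the traefik fragment elide the rest of the service definition. As a rough sketch only (the port mappings and volume paths below are assumptions based on a typical Traefik v2 setup, not taken from this deployment), the elided parts usually include:

```yaml
    ports:
      - "80:80"    # entrypoint "web", also used for the ACME HTTP challenge
      - "443:443"  # entrypoint "websecure"
    volumes:
      # Traefik discovers labeled containers through the Docker socket
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      # certificate storage; matches --certificatesresolvers.myresolver.acme.storage
      - "./letsencrypt:/letsencrypt"
    networks:
      - traefik_default
```

The `traefik_default` network here mirrors what the wiki.js compose files above expect; adjust paths to your environment.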

AI/ML

Voice control

DeepSpeech

  • DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

MycroftAI/mimic3

  • A fast and local neural text to speech system developed by Mycroft for the Mark II.

MycroftAI/mycroft-core

  • Mycroft is a hackable open source voice assistant.

Communication

Matrix

Server side

Dendrite

  • dendrite

  • INSTALL.md

  • Here we follow the official documentation to quickly deploy a service with Docker.

~$ git clone https://github.com/matrix-org/dendrite

Client side

Project management

OpenProject

Automated testing

RobotFramework

Trojan

  • trojan and dokku share port 443, with layer-4 (SNI-based) forwarding.
~$ nginx  -T
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
# configuration file /etc/nginx/nginx.conf:
user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;

events {
worker_connections 768;
# multi_accept on;
}

http {

##
# Basic Settings
##

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
# server_tokens off;

# server_names_hash_bucket_size 64;
# server_name_in_redirect off;

include /etc/nginx/mime.types;
default_type application/octet-stream;

##
# SSL Settings
##

ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
ssl_prefer_server_ciphers on;

##
# Logging Settings
##

access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;

##
# Gzip Settings
##

gzip on;

# gzip_vary on;
# gzip_proxied any;
# gzip_comp_level 6;
# gzip_buffers 16 8k;
# gzip_http_version 1.1;
# gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

##
# Virtual Host Configs
##

include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}


#mail {
# # See sample authentication script at:
# # http://wiki.nginx.org/ImapAuthenticateWithApachePhpScript
#
# # auth_http localhost/auth.php;
# # pop3_capabilities "TOP" "USER";
# # imap_capabilities "IMAP4rev1" "UIDPLUS";
#
# server {
# listen localhost:110;
# protocol pop3;
# proxy on;
# }
#
# server {
# listen localhost:143;
# protocol imap;
# proxy on;
# }
#}

# https://raymii.org/s/tutorials/nginx_1.15.2_ssl_preread_protocol_multiplex_https_and_ssh_on_the_same_port.html
stream {
    map $ssl_preread_server_name $backend_name {
        proxy.yjdwbj.cloudns.org trojan;
        nc.llccyy.dynv6.net mycloud;
        fpm.yjdwbj.cloudns.org fpm;
        default dokku;
    }
    upstream dokku {
        server 127.0.0.1:2000;
    }
    upstream fpm {
        server 127.0.0.1:6443;
    }
    upstream trojan {
        server 172.17.0.2:443;
    }
    upstream mycloud {
        server 127.0.0.1:5443;
    }

    server {
        listen 443 reuseport;
        listen [::]:443 reuseport;
        proxy_pass $backend_name;
        #proxy_protocol on;
        ssl_preread on;
    }
}

# configuration file /etc/nginx/modules-enabled/50-mod-http-auth-pam.conf:
load_module modules/ngx_http_auth_pam_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-dav-ext.conf:
load_module modules/ngx_http_dav_ext_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-echo.conf:
load_module modules/ngx_http_echo_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-geoip.conf:
load_module modules/ngx_http_geoip_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-image-filter.conf:
load_module modules/ngx_http_image_filter_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-subs-filter.conf:
load_module modules/ngx_http_subs_filter_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-upstream-fair.conf:
load_module modules/ngx_http_upstream_fair_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-http-xslt-filter.conf:
load_module modules/ngx_http_xslt_filter_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-mail.conf:
load_module modules/ngx_mail_module.so;

# configuration file /etc/nginx/modules-enabled/50-mod-stream.conf:
load_module modules/ngx_stream_module.so;

# configuration file /etc/nginx/mime.types:

types {
text/html html htm shtml;
text/css css;
text/xml xml;
image/gif gif;
image/jpeg jpeg jpg;
application/javascript js;
application/atom+xml atom;
application/rss+xml rss;

text/mathml mml;
text/plain txt;
text/vnd.sun.j2me.app-descriptor jad;
text/vnd.wap.wml wml;
text/x-component htc;

image/png png;
image/tiff tif tiff;
image/vnd.wap.wbmp wbmp;
image/x-icon ico;
image/x-jng jng;
image/x-ms-bmp bmp;
image/svg+xml svg svgz;
image/webp webp;

application/font-woff woff;
application/java-archive jar war ear;
application/json json;
application/mac-binhex40 hqx;
application/msword doc;
application/pdf pdf;
application/postscript ps eps ai;
application/rtf rtf;
application/vnd.apple.mpegurl m3u8;
application/vnd.ms-excel xls;
application/vnd.ms-fontobject eot;
application/vnd.ms-powerpoint ppt;
application/vnd.wap.wmlc wmlc;
application/vnd.google-earth.kml+xml kml;
application/vnd.google-earth.kmz kmz;
application/x-7z-compressed 7z;
application/x-cocoa cco;
application/x-java-archive-diff jardiff;
application/x-java-jnlp-file jnlp;
application/x-makeself run;
application/x-perl pl pm;
application/x-pilot prc pdb;
application/x-rar-compressed rar;
application/x-redhat-package-manager rpm;
application/x-sea sea;
application/x-shockwave-flash swf;
application/x-stuffit sit;
application/x-tcl tcl tk;
application/x-x509-ca-cert der pem crt;
application/x-xpinstall xpi;
application/xhtml+xml xhtml;
application/xspf+xml xspf;
application/zip zip;

application/octet-stream bin exe dll;
application/octet-stream deb;
application/octet-stream dmg;
application/octet-stream iso img;
application/octet-stream msi msp msm;

application/vnd.openxmlformats-officedocument.wordprocessingml.document docx;
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet xlsx;
application/vnd.openxmlformats-officedocument.presentationml.presentation pptx;

audio/midi mid midi kar;
audio/mpeg mp3;
audio/ogg ogg;
audio/x-m4a m4a;
audio/x-realaudio ra;

video/3gpp 3gpp 3gp;
video/mp2t ts;
video/mp4 mp4;
video/mpeg mpeg mpg;
video/quicktime mov;
video/webm webm;
video/x-flv flv;
video/x-m4v m4v;
video/x-mng mng;
video/x-ms-asf asx asf;
video/x-ms-wmv wmv;
video/x-msvideo avi;
}

# configuration file /etc/nginx/conf.d/dokku-installer.conf:
upstream dokku-installer { server 127.0.0.1:2000; }
server {
listen 80;
location / {
proxy_pass http://dokku-installer;
}
}

# configuration file /etc/nginx/conf.d/dokku.conf:
include /home/dokku/*/nginx.conf;

server_tokens off;

# Settings from https://mozilla.github.io/server-side-tls/ssl-config-generator/
ssl_session_cache shared:SSL:20m;
ssl_session_timeout 1d;
ssl_session_tickets off;

ssl_dhparam /etc/nginx/dhparam.pem;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;

# configuration file /home/dokku/fpm/nginx.conf:

server {
listen [::]:80;
listen 80;
server_name fpm.yjdwbj.cloudns.org;
access_log /var/log/nginx/fpm-access.log;
error_log /var/log/nginx/fpm-error.log;

return 301 https://$host:6443$request_uri;

}

server {
listen [::]:6443 ssl http2;
listen 6443 ssl http2;

server_name fpm.yjdwbj.cloudns.org;
access_log /var/log/nginx/fpm-access.log;
error_log /var/log/nginx/fpm-error.log;

ssl_certificate /home/dokku/fpm/tls/server.crt;
ssl_certificate_key /home/dokku/fpm/tls/server.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers off;

keepalive_timeout 70;


location / {

gzip on;
gzip_min_length 1100;
gzip_buffers 4 32k;
gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/x-javascript application/json application/xml application/rss+xml font/truetype application/x-font-ttf font/opentype application/vnd.ms-fontobject image/svg+xml;
gzip_vary on;
gzip_comp_level 6;

proxy_pass http://fpm-80;
http2_push_preload on;
proxy_http_version 1.1;
proxy_read_timeout 60s;
proxy_buffer_size 4096;
proxy_buffering on;
proxy_buffers 8 4096;
proxy_busy_buffers_size 8192;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Request-Start $msec;

}


include /home/dokku/fpm/nginx.conf.d/*.conf;

error_page 400 401 402 403 405 406 407 408 409 410 411 412 413 414 415 416 417 418 420 422 423 424 426 428 429 431 444 449 450 451 /400-error.html;
location /400-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 404 /404-error.html;
location /404-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 500 501 503 504 505 506 507 508 509 510 511 /500-error.html;
location /500-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 502 /502-error.html;
location /502-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}
}

upstream fpm-80 {

server 172.17.0.5:80;
}


# configuration file /home/dokku/fpm/nginx.conf.d/hsts.conf:
add_header Strict-Transport-Security "max-age=15724800; includeSubdomains" always;

# configuration file /home/dokku/mycloud/nginx.conf:

server {
listen [::]:80;
listen 80;
server_name nc.llccyy.dynv6.net;
access_log /var/log/nginx/mycloud-access.log;
error_log /var/log/nginx/mycloud-error.log;

return 301 https://$host:5443$request_uri;

}

server {
listen [::]:5443 ssl http2;
listen 5443 ssl http2;

server_name nc.llccyy.dynv6.net;
access_log /var/log/nginx/mycloud-access.log;
error_log /var/log/nginx/mycloud-error.log;

ssl_certificate /home/dokku/mycloud/tls/server.crt;
ssl_certificate_key /home/dokku/mycloud/tls/server.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers off;

keepalive_timeout 70;


location / {

gzip on;
gzip_min_length 1100;
gzip_buffers 4 32k;
gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/x-javascript application/json application/xml application/rss+xml font/truetype application/x-font-ttf font/opentype application/vnd.ms-fontobject image/svg+xml;
gzip_vary on;
gzip_comp_level 6;

proxy_pass http://mycloud-80;
http2_push_preload on;
proxy_http_version 1.1;
proxy_read_timeout 60s;
proxy_buffer_size 4096;
proxy_buffering on;
proxy_buffers 8 4096;
proxy_busy_buffers_size 8192;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Request-Start $msec;

}

client_max_body_size 100m;
include /home/dokku/mycloud/nginx.conf.d/*.conf;

error_page 400 401 402 403 405 406 407 408 409 410 411 412 413 414 415 416 417 418 420 422 423 424 426 428 429 431 444 449 450 451 /400-error.html;
location /400-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 404 /404-error.html;
location /404-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 500 501 503 504 505 506 507 508 509 510 511 /500-error.html;
location /500-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 502 /502-error.html;
location /502-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}
}

upstream mycloud-80 {

server 172.17.0.4:80;
}


# configuration file /home/dokku/mycloud/nginx.conf.d/hsts.conf:
add_header Strict-Transport-Security "max-age=15724800; includeSubdomains" always;

# configuration file /home/dokku/trojan/nginx.conf:

server {
listen [::]:7443 ssl http2;
listen 7443 ssl http2;

server_name proxy.yjdwbj.cloudns.org;
access_log /var/log/nginx/trojan-access.log;
error_log /var/log/nginx/trojan-error.log;

ssl_certificate /home/dokku/trojan/tls/server.crt;
ssl_certificate_key /home/dokku/trojan/tls/server.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers off;

keepalive_timeout 70;


location / {

gzip on;
gzip_min_length 1100;
gzip_buffers 4 32k;
gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/x-javascript application/json application/xml application/rss+xml font/truetype application/x-font-ttf font/opentype application/vnd.ms-fontobject image/svg+xml;
gzip_vary on;
gzip_comp_level 6;

proxy_pass https://trojan-443;
http2_push_preload on;
proxy_http_version 1.1;
proxy_read_timeout 60s;
proxy_buffer_size 4096;
proxy_buffering on;
proxy_buffers 8 4096;
proxy_busy_buffers_size 8192;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Request-Start $msec;

}


include /home/dokku/trojan/nginx.conf.d/*.conf;

error_page 400 401 402 403 405 406 407 408 409 410 411 412 413 414 415 416 417 418 420 422 423 424 426 428 429 431 444 449 450 451 /400-error.html;
location /400-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 404 /404-error.html;
location /404-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 500 501 503 504 505 506 507 508 509 510 511 /500-error.html;
location /500-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}

error_page 502 /502-error.html;
location /502-error.html {
root /var/lib/dokku/data/nginx-vhosts/dokku-errors;
internal;
}
}

upstream trojan-443 {

server 172.17.0.2:443;
}


# configuration file /home/dokku/trojan/nginx.conf.d/hsts.conf:
add_header Strict-Transport-Security "max-age=15724800; includeSubdomains" always;

# configuration file /etc/nginx/conf.d/server_names_hash_bucket_size.conf:
#server_names_hash_bucket_size 512;


  • The local ports corresponding to the map above are 5443, 7443 and 6443.
netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:6443 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:2000 0.0.0.0:* LISTEN 27904/python3
tcp 0 0 0.0.0.0:7443 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:2132 0.0.0.0:* LISTEN 1027/sshd
tcp 0 0 0.0.0.0:53 0.0.0.0:* LISTEN 2715/dnsmasq
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:8123 0.0.0.0:* LISTEN 444/polipo
tcp 0 0 0.0.0.0:5443 0.0.0.0:* LISTEN 23764/nginx: master
tcp 0 0 0.0.0.0:1443 0.0.0.0:* LISTEN 1435/ss-server
tcp6 0 0 :::6443 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::80 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::7443 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::2132 :::* LISTEN 1027/sshd
tcp6 0 0 :::53 :::* LISTEN 2715/dnsmasq
tcp6 0 0 :::443 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::5443 :::* LISTEN 23764/nginx: master
tcp6 0 0 :::1443 :::* LISTEN 1435/ss-server

Resource roundup

  • 100 good free tools and resources
  • https://arxiv.org/
    • arXiv is a free distribution service and an open-access archive for 1,993,024 scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.
  • Computer engineering resource index


Volumes

  • Kubernetes redefines volumes and gives them rich, powerful functionality. By function they fall into three classes: local volumes, network volumes, and information volumes.
  • Volume types Kubernetes supports (the list grows and shrinks across releases):
    • Local volumes:
      • EmptyDir
      • HostPath
    • Network volumes:
      • NFS
      • iSCSI
      • GlusterFS
      • RBD
      • Flocker
      • GCE Persistent Disk
      • AWS Elastic Block Store
      • azureDisk
      • CephFS
      • fc (Fibre Channel)
      • Persistent Volume Claim
    • Information volumes:
      • Git Repo (deprecated)
      • Secret
      • Downward API

Local volumes

HostPath

  • Most Pods should be oblivious to their host node and should not access any files on the node's filesystem. But some system-level Pods (usually managed by a DaemonSet) do need to read the node's files, and in a test environment HostPath can stand in for a PV. A HostPath volume points at a specific file or directory on the node's filesystem; Pods running on the same node that use the same path in their HostPath volumes see the same files. To use HostPath provisioning in a cluster, start kube-controller-manager with the --enable-hostpath-provisioner flag.
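A minimal sketch of a Pod using a hostPath volume (all names here are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo            # illustrative name
spec:
  containers:
    - name: log-reader
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: node-logs
          mountPath: /host-logs  # path inside the container
  volumes:
    - name: node-logs
      hostPath:
        path: /var/log           # path on the host node's filesystem
        type: Directory          # fail if /var/log does not exist on the node
```

Two such Pods scheduled on the same node see the same files under /var/log; on different nodes they see different content, which is why HostPath is unsuitable for ordinary application data.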

Ceph cluster

Overview

  • Ceph is an open-source project providing a software-defined (SDS), unified storage solution. It is highly scalable, with capacity extending to the EB level. Ceph's technical strengths show in four areas: cluster reliability, cluster scalability, data safety, and interface uniformity.

Storage backends

  • The backend object store can be either filestore or bluestore:

  • FileStore:

    • FileStore is the legacy approach to storing objects in Ceph. It relies on a standard file system (normally XFS) in combination with a key/value database (traditionally LevelDB, now RocksDB) for some metadata.
    • FileStore is well-tested and widely used in production. However, it suffers from many performance deficiencies due to its overall design and its reliance on a traditional file system for object data storage.
    • Although FileStore is capable of functioning on most POSIX-compatible file systems (including btrfs and ext4), we recommend that only the XFS file system be used with Ceph. Both btrfs and ext4 have known bugs and deficiencies and their use may lead to data loss. By default, all Ceph provisioning tools use XFS.
  • BlueStore:

    • Key BlueStore features include:
      • Direct management of storage devices. BlueStore consumes raw block devices or partitions. This avoids intervening layers of abstraction (such as local file systems like XFS) that can limit performance or add complexity.
      • Metadata management with RocksDB. RocksDB’s key/value database is embedded in order to manage internal metadata, including the mapping of object names to block locations on disk.
      • Full data and metadata checksumming. By default, all data and metadata written to BlueStore is protected by one or more checksums. No data or metadata is read from disk or returned to the user without being verified.
      • Inline compression. Data can be optionally compressed before being written to disk.
      • Multi-device metadata tiering. BlueStore allows its internal journal (write-ahead log) to be written to a separate, high-speed device (like an SSD, NVMe, or NVDIMM) for increased performance. If a significant amount of faster storage is available, internal metadata can be stored on the faster device.
      • Efficient copy-on-write. RBD and CephFS snapshots rely on a copy-on-write clone mechanism that is implemented efficiently in BlueStore. This results in efficient I/O both for regular snapshots and for erasure-coded pools (which rely on cloning to implement efficient two-phase commits).
    • The following device layouts are supported:
      • A block device, a block.wal, and a block.db device
      • A block device and a block.wal device
      • A block device and a block.db device
      • A single block device
    • The block device itself has three options:
      • a whole disk
      • a disk partition
      • a logical volume (an LVM LV)
  • Notes:

    1. A raw disk cannot be used as block.db or block.wal, otherwise it fails with: blkid could not detect a PARTUUID for device;
    2. If a disk or a partition is used as block, ceph-volume creates an LV on it; if a partition is used as block.db or block.wal, the partition is used directly and no LV is created.
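Under those constraints, creating a BlueStore OSD with separate DB and WAL devices looks roughly like this (the device names are illustrative):

```
# /dev/sdb as --data: a whole disk, so ceph-volume creates an LV on it
# NVMe partitions as block.db / block.wal: used directly, no LV is created
~$ sudo ceph-volume lvm create --bluestore --data /dev/sdb \
     --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2
```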
  • BlueFS divides BlueStore's storage space into three tiers:

    • Slow space: mainly stores object data; can be served by ordinary high-capacity spinning disks and is managed by BlueStore itself.
    • High-speed (DB) space: stores metadata generated internally by BlueStore; can be served by an ordinary SSD; needs less capacity than the slow space.
    • Ultra-fast (WAL) space: mainly stores the .log files RocksDB produces internally; can be served by an SSD, NVRAM, or other devices with lower latency than an ordinary SSD. Its capacity needs match the DB space, and it is likewise managed directly by BlueFS.

Ceph components

  • Ceph OSD (Object Storage Device): stores data; handles data replication, recovery, and backfill rebalancing; and reports related data such as OSD heartbeats to the Ceph Monitors. A Ceph storage cluster needs at least one OSD to reach the active+clean health state and keep valid replicas of its data (two replicas by default, adjustable). Note: every disk or partition can become an OSD.
  • Ceph Monitor: maintains the health state of the whole cluster and provides consistent decision-making; it holds the Monitor map, OSD map, PG (Placement Group) map and CRUSH map.
  • Ceph MDS (Ceph Metadata Server): stores the metadata of the Ceph file system. Note: neither Ceph block storage nor Ceph object storage needs the MDS. The MDS provides POSIX-file-system users with basic commands such as ls and find. An MDS is only needed when creating a CephFS, and CephFS is still some distance from production use.

Ceph features

RADOS

  • RADOS offers self-healing and related capabilities, providing reliable, automatic, intelligent distributed storage. Its soul is the CRUSH (Controlled Replication Under Scalable Hashing) algorithm.

Ceph file system

  • CephFS is a distributed file system implemented on top of RADOS. It introduces the MDS (Metadata Server) mainly to provide metadata for POSIX file-system compatibility, and it is normally mounted like a regular file system.
  • Ceph file system (architecture diagram)

Ceph block devices

  • RBD (Rados Block Device) is built on top of librados: librbd creates a block device that is attached to a VM via QEMU/KVM and used like a traditional block device. OpenStack, CloudStack and others currently provision VM block devices this way, and snapshots and COW (Copy On Write) are also supported.

  • Ceph block device (architecture diagram)

Ceph object gateway

  • RADOSGW is built on top of librados and provides a gateway speaking the popular RESTful protocol, compatible with the AWS S3 and Swift interfaces. As object storage it can back cloud-drive applications, HLS streaming-media applications, and so on.

  • Overall architecture (diagram)

Installing with ceph/ceph-ansible

  • Notes on the release branches:
    • stable-3.0 supports Ceph jewel and luminous; this branch requires Ansible 2.4.
    • stable-3.1 supports Ceph luminous and mimic; this branch requires Ansible 2.4.
    • stable-3.2 supports Ceph luminous and mimic; this branch requires Ansible 2.6.
    • stable-4.0 supports Ceph nautilus; this branch requires Ansible 2.8.
    • master tracks Ceph@master; this branch requires Ansible 2.8.
~$ git clone https://github.com/ceph/ceph-ansible
~$ cd ceph-ansible && git checkout v3.2.9
~$ pip install -r requirements.txt

Installing with ceph/ceph-deploy

Quick install (apt)

  • ceph-deploy is the older deployment method and the process is somewhat involved. In testing, ceph-deploy could not be installed via apt; pip install ceph-deploy worked.
  • The experiments below use VirtualBox VMs. Create a Linux VM, install Debian 9, and give it two NICs: a NAT interface (10.0.2.0/24) for downloading packages, and a Vboxnet1 interface (192.168.99.0/24) for cluster traffic. Install some common tools, then clone four new VMs, changing each clone's hostname and IP address. Ansible is used below to operate on these VMs in batch.

(VirtualBox cluster layout diagram)

  • On each node VM, install the time-sync packages (apt-get install ntp ntpdate ntp-doc) and set up passwordless SSH public-key login. Ansible can batch these steps as well.
  • Note: /etc/hosts on every node must agree with the hosts that ceph-deploy operates on, otherwise ceph-deploy mon create-initial fails to proceed.
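For example, each node (and the ceph-deploy host) might carry entries like the following in /etc/hosts; node1 maps to 192.168.99.101 as seen in the ansible output below, while the other two addresses are illustrative:

```
192.168.99.101  node1
192.168.99.102  node2
192.168.99.103  node3
```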

Cleaning up old nodes

~$ ceph-deploy purge {ceph-node} [{ceph-node}]
~$ ceph-deploy purgedata {ceph-node} [{ceph-node}]
~$ ceph-deploy forgetkeys
~$ rm ceph.*

Installing nodes

~$ ceph-deploy new --help
usage: ceph-deploy new [-h] [--no-ssh-copykey] [--fsid FSID]
[--cluster-network CLUSTER_NETWORK]
[--public-network PUBLIC_NETWORK]
MON [MON ...]

Start deploying a new cluster, and write a CLUSTER.conf and keyring for it.

positional arguments:
MON initial monitor hostname, fqdn, or hostname:fqdn pair

optional arguments:
-h, --help show this help message and exit
--no-ssh-copykey do not attempt to copy SSH keys
--fsid FSID provide an alternate FSID for ceph.conf generation
--cluster-network CLUSTER_NETWORK
specify the (internal) cluster network
--public-network PUBLIC_NETWORK
specify the public network for a cluster

~$ ceph-deploy new node1 node2 node3 --public-network 192.168.99.0/24

# The command below ssh's into each node and installs the ceph packages, roughly:
# apt install ceph ceph-base ceph-common ceph-mds ceph-mon ceph-osd radosgw
# Because ceph-deploy 2.0.x is used, the release must be pinned to luminous (v12) or newer,
# otherwise it installs the Mimic (v13) release by default. The latest release is Nautilus (v14.0.2).
~$ ceph-deploy install --release luminous node1 node2 node3
# These two flags speed up the install: --repo-url http://mirrors.ustc.edu.cn/ceph/debian-luminous --gpg-url http://mirrors.ustc.edu.cn/ceph/keys/release.asc
# The command above creates ceph.conf and ceph.mon.keyring in the current directory
  • Configuration references for Pool, PG (Placement Groups) and CRUSH are in the official documentation; tuning the PG parameters below can follow PgCalc.
# --> ceph.conf
[global]

# By default, Ceph makes 3 replicas of objects. If you want to make four
# copies of an object the default value--a primary copy and three replica
# copies--reset the default values as shown in 'osd pool default size'.
# If you want to allow Ceph to write a lesser number of copies in a degraded
# state, set 'osd pool default min size' to a number less than the
# 'osd pool default size' value.

osd pool default size = 3 # Write an object 3 times.
osd pool default min size = 2 # Allow writing two copies in a degraded state.

# Ensure you have a realistic number of placement groups. We recommend
# approximately 100 per OSD. E.g., total number of OSDs multiplied by 100
# divided by the number of replicas (i.e., osd pool default size). So for
# 10 OSDs and osd pool default size = 4, we'd recommend approximately
# (100 * 10) / 4 = 250.

osd pool default pg num = 250
osd pool default pgp num = 250
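The rule of thumb in the comments above can be computed directly. A small helper (illustrative, not part of any Ceph tool; note that Ceph's PG calculator additionally rounds the raw value up to a power of two):

```python
def recommended_pg_count(num_osds: int, pool_size: int, pgs_per_osd: int = 100) -> int:
    """Approximate PG count: (pgs_per_osd * num_osds) / pool_size,
    rounded up to the next power of two as the Ceph PG calculator suggests."""
    raw = pgs_per_osd * num_osds / pool_size
    power = 1
    while power < raw:
        power *= 2
    return power

# 10 OSDs with 4 replicas: raw value 250, rounded up to 256
print(recommended_pg_count(10, 4))
```

The ceph.conf above sets the raw value (250) directly, which also works; Ceph only warns when PG counts drift far from the recommendation.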
  • Initialize the Monitors

    ~$ ceph-deploy mon create node1 node2 node3
    ~$ ceph-deploy gatherkeys node1 node2 node3
  • Note: if the error below appears, the file system may have less than 5% free space. The error details are in /var/log/ceph/ceph-mon.DB001.log:

    [DB001][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.DB001.asok mon_status
    [DB001][ERROR ] b'admin_socket: exception getting command descriptions: [Errno 2] No such file or directory'
  • The following files are created in the current directory:

    • {cluster-name}.client.admin.keyring
    • {cluster-name}.mon.keyring
    • {cluster-name}.bootstrap-osd.keyring
    • {cluster-name}.bootstrap-mds.keyring
    • {cluster-name}.bootstrap-rgw.keyring
    • {cluster-name}.bootstrap-mgr.keyring
  • Distribute the Ceph configuration and keys to the cluster nodes:

    $ ceph-deploy admin node1 node2 node3
    [ceph_deploy.conf][DEBUG ] found configuration file at: /home/lcy/.cephdeploy.conf
    [ceph_deploy.cli][INFO ] Invoked (2.0.1): /home/lcy/.pyenv/versions/py3dev/bin/ceph-deploy admin node1 node2 node3
    [ceph_deploy.cli][INFO ] ceph-deploy options:
    [ceph_deploy.cli][INFO ] verbose : False
    [ceph_deploy.cli][INFO ] quiet : False
    [ceph_deploy.cli][INFO ] username : None
    [ceph_deploy.cli][INFO ] overwrite_conf : False
    [ceph_deploy.cli][INFO ] ceph_conf : None
    [ceph_deploy.cli][INFO ] cluster : ceph
    [ceph_deploy.cli][INFO ] client : ['node1', 'node2', 'node3']
    [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fef98f77390>
    [ceph_deploy.cli][INFO ] default_release : False
    [ceph_deploy.cli][INFO ] func : <function admin at 0x7fef99bdd6a8>
    [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to node1
    [node1][DEBUG ] connection detected need for sudo
    [node1][DEBUG ] connected to host: node1
    [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to node2
    [node2][DEBUG ] connection detected need for sudo
    [node2][DEBUG ] connected to host: node2
    [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to node3
    [node3][DEBUG ] connection detected need for sudo
    [node3][DEBUG ] connected to host: node3
  • Check the cluster status. You can log in and run this directly with root privileges, or run it through Ansible:

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_OK

    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

    data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0B
    usage: 0B used, 0B / 0B avail
    pgs:

Ceph Manager deployment

  • Reference documentation
    ~$ ceph-deploy mgr create node1 node2 node3
    [...]
    ~$ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_OK

    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1(active), standbys: node3, node2
    osd: 0 osds: 0 up, 0 in

    data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0B
    usage: 0B used, 0B / 0B avail
    pgs:

Ceph OSD deployment

  • ceph-volume
  • Since Ceph Luminous 12.2.2, ceph-disk is deprecated and replaced by ceph-volume.
~$ ceph-deploy osd -h
usage: ceph-deploy osd [-h] {list,create} ...

Create OSDs from a data disk on a remote host:

ceph-deploy osd create {node} --data /path/to/device

For bluestore, optional devices can be used::

ceph-deploy osd create {node} --data /path/to/data --block-db /path/to/db-device
ceph-deploy osd create {node} --data /path/to/data --block-wal /path/to/wal-device
ceph-deploy osd create {node} --data /path/to/data --block-db /path/to/db-device --block-wal /path/to/wal-device

For filestore, the journal must be specified, as well as the objectstore::

ceph-deploy osd create {node} --filestore --data /path/to/data --journal /path/to/journal

For data devices, it can be an existing logical volume in the format of:
vg/lv, or a device. For other OSD components like wal, db, and journal, it
can be logical volume (in vg/lv format) or it must be a GPT partition.

positional arguments:
{list,create}
list List OSD info from remote host(s)
create Create new Ceph OSD daemon by preparing and activating a
device

optional arguments:
-h, --help show this help message and exit

~$ ceph-deploy osd create -h
usage: ceph-deploy osd create [-h] [--data DATA] [--journal JOURNAL]
[--zap-disk] [--fs-type FS_TYPE] [--dmcrypt]
[--dmcrypt-key-dir KEYDIR] [--filestore]
[--bluestore] [--block-db BLOCK_DB]
[--block-wal BLOCK_WAL] [--debug]
[HOST]

positional arguments:
HOST Remote host to connect

optional arguments:
-h, --help show this help message and exit
--data DATA The OSD data logical volume (vg/lv) or absolute path
to device
--journal JOURNAL Logical Volume (vg/lv) or path to GPT partition
--zap-disk DEPRECATED - cannot zap when creating an OSD
--fs-type FS_TYPE filesystem to use to format DEVICE (xfs, btrfs)
--dmcrypt use dm-crypt on DEVICE
--dmcrypt-key-dir KEYDIR
directory where dm-crypt keys are stored
--filestore filestore objectstore
--bluestore bluestore objectstore
--block-db BLOCK_DB bluestore block.db path
--block-wal BLOCK_WAL
bluestore block.wal path
--debug Enable debug mode on remote ceph-volume calls
  • node1 has an extra disk added to serve as its OSD disk; below, each node's whole disk is created as a single block device.
~$ ceph-deploy osd create node1 --data /dev/vdb
~$ ceph-deploy osd create node2 --data /dev/vdb
~$ ceph-deploy osd create node3 --data /dev/vdb
  • Error message: if the disk still carries old LVM metadata, it must be cleared manually first, otherwise the error below appears. Inspect it with lvdisplay, then clear the old LVM metadata with lvremove --force, vgdisplay, and vgremove --force.
[DB001][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
  • Clear the old LVM metadata

    ~$ ansible -i hosts all  -b -m shell  -a "lvdisplay | awk 'NR==2 {print $3}'| xargs  lvremove --force ;  vgdisplay | awk 'NR==2 {print $3}' | xargs  vgremove"
  • Check the OSD status

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph osd stat"
    3 osds: 3 up, 3 in

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph df"
    GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    180GiB 177GiB 3.02GiB 1.68
    POOLS:
    NAME ID USED %USED MAX AVAIL OBJECTS
    hdd 1 375B 0 84.0GiB 8
    cephfs_data 2 0B 0 84.0GiB 0
    cephfs_metadata 3 2.19KiB 0 84.0GiB 21
  • Inspect the OSDs

    $ ansible -i hosts node1  -b -m command -a "ceph osd tree"

    node1 | CHANGED | rc=0 >>
    ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
    -1 0.17578 root default
    -3 0.05859 host node1
    0 hdd 0.05859 osd.0 up 1.00000 1.00000
    -7 0.05859 host node2
    2 hdd 0.05859 osd.2 up 1.00000 1.00000
    -5 0.05859 host node3
    1 hdd 0.05859 osd.1 up 1.00000 1.00000
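In the tree above, the WEIGHT column is the CRUSH weight, which by default equals the device capacity expressed in TiB. A quick sanity check (illustrative only, assuming the 60 GiB OSD disks implied by the 180 GiB total that `ceph df` reports):

```python
# CRUSH weight defaults to capacity in TiB; each OSD disk here is 60 GiB.
capacity_gib = 60
weight = capacity_gib / 1024
print(round(weight, 5))  # 0.05859, matching the WEIGHT column
```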
  • Check the system status

    $ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_OK

    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1(active), standbys: node3, node2
    osd: 1 osds: 1 up, 1 in

    data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0B
    usage: 1.00GiB used, 9.00GiB / 10.0GiB avail # from the OSD just created on node1:/dev/sdb.
    pgs:
    # Pipelines require the shell module; the command module errors out on them.
    ~$ ansible -i ../hosts node1 -b -m shell -a "mount | grep ceph"
    tmpfs on /var/lib/ceph/osd/ceph-0 type tmpfs (rw,relatime)

    ~$ ansible -i ../hosts node1 -b -m command -a "ls -l /var/lib/ceph/"
    192.168.99.101 | CHANGED | rc=0 >>
    total 44
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 bootstrap-mds
    drwxr-xr-x 2 ceph ceph 4096 May 9 21:33 bootstrap-mgr
    drwxr-xr-x 2 ceph ceph 4096 May 9 22:27 bootstrap-osd
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 bootstrap-rbd
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 bootstrap-rgw
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 mds
    drwxr-xr-x 3 ceph ceph 4096 May 9 21:33 mgr
    drwxr-xr-x 3 ceph ceph 4096 May 9 21:22 mon
    drwxr-xr-x 3 ceph ceph 4096 May 9 22:27 osd
    drwxr-xr-x 2 ceph ceph 4096 Apr 11 08:44 radosgw
    drwxr-xr-x 2 ceph ceph 4096 May 9 21:22 tmp

    ~$ ansible -i ../hosts node1 -b -m command -a "ls -l /var/lib/ceph/osd/ceph-0"
    192.168.99.101 | CHANGED | rc=0 >>
    total 48
    -rw-r--r-- 1 ceph ceph 393 May 9 22:27 activate.monmap
    lrwxrwxrwx 1 ceph ceph 93 May 9 22:27 block -> /dev/ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024/osd-block-261c9455-fbc4-4eba-9783-5fba4290048d
    -rw-r--r-- 1 ceph ceph 2 May 9 22:27 bluefs
    -rw-r--r-- 1 ceph ceph 37 May 9 22:27 ceph_fsid
    -rw-r--r-- 1 ceph ceph 37 May 9 22:27 fsid
    -rw------- 1 ceph ceph 55 May 9 22:27 keyring
    -rw-r--r-- 1 ceph ceph 8 May 9 22:27 kv_backend
    -rw-r--r-- 1 ceph ceph 21 May 9 22:27 magic
    -rw-r--r-- 1 ceph ceph 4 May 9 22:27 mkfs_done
    -rw-r--r-- 1 ceph ceph 41 May 9 22:27 osd_key
    -rw-r--r-- 1 ceph ceph 6 May 9 22:27 ready
    -rw-r--r-- 1 ceph ceph 10 May 9 22:27 type
    -rw-r--r-- 1 ceph ceph 2 May 9 22:27 whoami
  • Inspect Ceph's configuration parameters

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph --show-config"
    name = client.admin
    cluster = ceph
    debug_none = 0/5
    debug_lockdep = 0/1
    [....]
  • Inspect the LVM details

    ~$ ansible -i ../hosts node1 -b -m command -a "pvdisplay"
    192.168.99.101 | CHANGED | rc=0 >>
    --- Physical volume ---
    PV Name /dev/sdb
    VG Name ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024
    PV Size 10.00 GiB / not usable 4.00 MiB
    Allocatable yes (but full)
    PE Size 4.00 MiB
    Total PE 2559
    Free PE 0
    Allocated PE 2559
    PV UUID Qd6kSs-Ivbp-3APy-21Tv-XQgx-EhBn-XfioVa

    ~$ ansible -i ../hosts node1 -b -m command -a "vgdisplay"
    192.168.99.101 | CHANGED | rc=0 >>
    --- Volume group ---
    VG Name ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024
    System ID
    Format lvm2
    Metadata Areas 1
    Metadata Sequence No 17
    VG Access read/write
    VG Status resizable
    MAX LV 0
    Cur LV 1
    Open LV 1
    Max PV 0
    Cur PV 1
    Act PV 1
    VG Size 10.00 GiB
    PE Size 4.00 MiB
    Total PE 2559
    Alloc PE / Size 2559 / 10.00 GiB
    Free PE / Size 0 / 0
    VG UUID XiVkQ6-aUPv-3BRw-Gj1N-jdG4-HRxf-hCS3Mg

    ~$ ansible -i ../hosts node1 -b -m command -a "lvdisplay"
    192.168.99.101 | CHANGED | rc=0 >>
    --- Logical volume ---
    LV Path /dev/ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024/osd-block-261c9455-fbc4-4eba-9783-5fba4290048d
    LV Name osd-block-261c9455-fbc4-4eba-9783-5fba4290048d
    VG Name ceph-195012d6-0c8a-45bf-964c-3ac15f2cd024
    LV UUID F9dF0S-qwb7-LJC0-vld2-TF6g-nP8q-9ncsdI
    LV Write Access read/write
    LV Creation host, time node1, 2019-05-09 22:27:44 -0400
    LV Status available
    # open 4
    LV Size 10.00 GiB
    Current LE 2559
    Segments 1
    Allocation inherit
    Read ahead sectors auto
    - currently set to 256
    Block device 253:0
  • A quick note on how the LVM layers relate: an LV is built on top of a VG, and a VG is built on top of PVs.

  • Next, shut down node2 and add a 20G disk to it, to test the other BlueStore device layouts.

    ~$ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_WARN
    1/3 mons down, quorum node1,node3 # warns that one node is shut down.

    services:
    mon: 3 daemons, quorum node1,node3, out of quorum: node2
    mgr: node1(active), standbys: node3
    osd: 1 osds: 1 up, 1 in

    data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0B
    usage: 1.00GiB used, 9.00GiB / 10.0GiB avail
    pgs:

Parted (GPT partitioning)

  • Partitioning with fdisk (MBR) leads to errors, so parted (GPT) is used below.

    root@node2:~# parted /dev/sdb
    GNU Parted 3.2
    Using /dev/sdb
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) mklabel gpt
    (parted) print
    Model: ATA VBOX HARDDISK (scsi)
    Disk /dev/sdb: 21.5GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number Start End Size File system Name Flags

    (parted) mkpart parimary 0 10G
    Warning: The resulting partition is not properly aligned for best performance.
    Ignore/Cancel?
    Ignore/Cancel? Ignore
    (parted) print
    Model: ATA VBOX HARDDISK (scsi)
    Disk /dev/sdb: 21.5GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number Start End Size File system Name Flags
    1 17.4kB 10.0GB 10000MB parimary

    (parted) mkpart parimary 10G 21.5G
    (parted) p
    Model: ATA VBOX HARDDISK (scsi)
    Disk /dev/sdb: 21.5GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number Start End Size File system Name Flags
    1 17.4kB 10.0GB 10000MB parimary
    2 10.0GB 21.5GB 11.5GB parimary
    (parted) q

    root@node2:~# partx /dev/sdb
    NR START END SECTORS SIZE NAME UUID
    1 34 19531250 19531217 9.3G parimary a8c625b7-ebf2-4ceb-a9fd-5371dde59b35
    2 19531776 41940991 22409216 10.7G parimary 6463703f-c1f3-4ad7-8870-ed634db64131

    root@node2:~# lsblk /dev/sdb
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sdb 8:16 0 20G 0 disk
    ├─sdb1 8:17 0 9.3G 0 part
    └─sdb2 8:18 0 10.7G 0 part
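The "not properly aligned" warning above appears because the first partition starts at sector 34 (17.4kB) rather than on a 1 MiB boundary; starting it with, e.g., `mkpart parimary 1MiB 10GB` avoids the warning. The sector arithmetic, as an illustrative check (assuming the 512-byte logical sectors shown by parted):

```python
SECTOR = 512            # logical sector size from the parted output
ALIGN = 1024 * 1024     # 1 MiB alignment boundary

first_aligned_sector = ALIGN // SECTOR
print(first_aligned_sector)            # 2048, the usual aligned start
# Sector 34, where parted actually began, is not on the boundary:
print((34 * SECTOR) % ALIGN == 0)      # False
```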
  • Add osd.1 (block, block.db)

~$ ceph-deploy osd create node2 --data /dev/sdb2 --block-db /dev/sdb1
[...]
[node2][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb2 --block.db /dev/sdb1
[node2][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
[node2][INFO ] checking OSD status...
[node2][INFO ] Running command: sudo /usr/bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host node2 is now ready for osd use.

# Check the cluster status.
~$ ansible -i ../hosts node1 -b -m command -a "ceph -s"
192.168.99.101 | CHANGED | rc=0 >>
cluster:
id: 0bf150da-b691-4382-bf3d-600e90c19fba
health: HEALTH_OK

services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node1(active), standbys: node3, node2
osd: 2 osds: 2 up, 2 in

data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 2.00GiB used, 18.7GiB / 20.7GiB avail
pgs:
# Inspect /var/lib/ceph/osd/ceph-1 on node2
~$ ansible -i ../hosts node2 -b -m command -a "ls -l /var/lib/ceph/osd/ceph-1"
192.168.99.102 | CHANGED | rc=0 >>
total 48
-rw-r--r-- 1 ceph ceph 393 May 9 23:53 activate.monmap
lrwxrwxrwx 1 ceph ceph 93 May 9 23:53 block -> /dev/ceph-98f53d51-8e74-4ca3-8b7a-87570c01733e/osd-block-f572ef53-805e-48ff-b936-da520e46be6b
lrwxrwxrwx 1 ceph ceph 9 May 9 23:53 block.db -> /dev/sdb1
-rw-r--r-- 1 ceph ceph 2 May 9 23:53 bluefs
-rw-r--r-- 1 ceph ceph 37 May 9 23:53 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 May 9 23:53 fsid
-rw------- 1 ceph ceph 55 May 9 23:53 keyring
-rw-r--r-- 1 ceph ceph 8 May 9 23:53 kv_backend
-rw-r--r-- 1 ceph ceph 21 May 9 23:53 magic
-rw-r--r-- 1 ceph ceph 4 May 9 23:53 mkfs_done
-rw-r--r-- 1 ceph ceph 41 May 9 23:53 osd_key
-rw-r--r-- 1 ceph ceph 6 May 9 23:53 ready
-rw-r--r-- 1 ceph ceph 10 May 9 23:53 type
-rw-r--r-- 1 ceph ceph 2 May 9 23:53 whoami

Creating the MDS servers

~$ ceph-deploy mds create  FE001 DIG001
# Check the status
~$ ansible -i ../hosts node1 -b -m command -a "ceph mds stat"

~$ ansible -i ../hosts node1 -b -m command -a "ceph osd pool create cephfs_data 64 64"
pool 'cephfs_data' created

~$ ansible -i ../hosts node1 -b -m command -a "ceph osd pool create cephfs_metadata 64 64"
pool 'cephfs_metadata' created

# Create the filesystem
~$ ansible -i ../hosts node1 -b -m command -a "ceph fs new cephfs cephfs_metadata cephfs_data"
new fs with metadata pool 3 and data pool 2

~$ ansible -i ../hosts node1 -b -m command -a "ceph mds stat"
cephfs-1/1/1 up {0=DIG001=up:active}, 1 up:standby
~$ ansible -i ../hosts node1 -b -m command -a "ceph fs ls"
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]

~$ ansible -i ../hosts node1 -b -m command -a "ceph fs status"
cephfs - 0 clients
======
+------+--------+--------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+--------+---------------+-------+-------+
| 0 | active | DIG001 | Reqs: 0 /s | 10 | 12 |
+------+--------+--------+---------------+-------+-------+
+-----------------+----------+-------+-------+
| Pool | type | used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 2246 | 83.9G |
| cephfs_data | data | 0 | 83.9G |
+-----------------+----------+-------+-------+

+-------------+
| Standby MDS |
+-------------+
| FE001 |
+-------------+
MDS version: ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)

# Inspect the OSD metadata.
~$ sudo ceph osd metadata osd.2
  • Mount the filesystem. You can use the key from /etc/ceph/ceph.client.admin.keyring, or create a dedicated user and key as shown below.

    ~$ sudo ceph auth get-or-create client.cephfs mon 'allow r' mds 'allow rw' osd 'allow rw pool=cephfs-data, allow rw pool=cephfs-metadata'
    ~$ sudo ceph auth get client.cephfs
    exported keyring for client.cephfs
    [client.cephfs]
    key = AQDAwhldGXL3GhAAGsHu3XYUIwzS6z0SOcLMFA==
    caps mds = "allow rw"
    caps mon = "allow r"
    caps osd = "allow rw pool=cephfs-data, allow rw pool=cephfs-metadata"

    ~$ sudo mount.ceph node1:6789:/ /data -o name=cephfs,secret=AQDAwhldGXL3GhAAGsHu3XYUIwzS6z0SOcLMFA==
  • With the mount invocation above, the key is visible in the shell, which is insecure. Instead, save the Base64 key string AQDAwhldGXL3GhAAGsHu3XYUIwzS6z0SOcLMFA== to a file and restrict it with chmod 400.

    ~$ sudo mount.ceph node1:6789:/ /data -o name=cephfs,secretfile=/etc/ceph/cephfs.secret

# Add an automatic mount entry
    ~$ echo "mon1:6789,mon2:6789,mon3:6789:/ /cephfs ceph name=cephfs,secretfile=/etc/ceph/cephfs.secret,_netdev,noatime 0 0" | sudo tee -a /etc/fstab
[2019-07-01 14:20:45,567][ceph_volume.process][INFO ] stdout ceph.block_device=/dev/ceph-bd417a6a-cef6-4ff5-828a-5b68ec8843f0/osd-block-dcde5f54-c555-41ee-8c20-586f1069bcb7,ceph.block_uuid=wHZD0b-lU7P-vYFg-XOBI-zknV-Q181-0xKtt3,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=d7f63adc-33d1-4ae9-9ba7-ae401950d965,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=dcde5f54-c555-41ee-8c20-586f1069bcb7,ceph.osd_id=1,ceph.type=block,ceph.vdo=0";"/dev/ceph-bd417a6a-cef6-4ff5-828a-5b68ec8843f0/osd-block-dcde5f54-c555-41ee-8c20-586f1069bcb7";"osd-block-dcde5f54-c555-41ee-8c20-586f1069bcb7";"ceph-bd417a6a-cef6-4ff5-828a-5b68ec8843f0";"wHZD0b-lU7P-vYFg-XOBI-zknV-Q181-0xKtt3";"60.00g
[2019-07-01 14:20:45,567][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
File "/usr/lib/python2.7/dist-packages/ceph_volume/main.py", line 148, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, in dispatch
instance.main()
File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/main.py", line 40, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, in dispatch
instance.main()
File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
self.activate(args)
File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 249, in activate
raise RuntimeError('could not find osd.%s with fsid %s' % (osd_id, osd_fsid))
RuntimeError: could not find osd.1 with fsid 3aeba7b7-f539-4b6a-afac-fc9fd62b90fa
~$ sudo lvs -o lv_tags
LV Tags
ceph.block_device=/dev/ceph-9c0a0bae-d6db-498a-bf20-fe4cd8bdb3a9/osd-block-5c5a950b-8b36-4935-be8c-b59c24073874,ceph.block_uuid=yY970H-ztZ4-VtfA-2L9d-k3cF-Zi44-0i8MB1,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=d7f63adc-33d1-4ae9-9ba7-ae401950d965,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=5c5a950b-8b36-4935-be8c-b59c24073874,ceph.osd_id=2,ceph.type=block,ceph.vdo=0
  • RGW can also be installed on servers outside the Ceph cluster; it requires the ceph-radosgw package, e.g. ceph-deploy install --rgw <rgw-node> [<rgw-node>...]. For convenience, RGW is installed directly on node3 and node4 below.

  • Add a mon node

    ~$ ceph-deploy mon add node4
# node3 already received the admin config earlier.
~$ ceph-deploy admin node4
~$ ceph-deploy rgw create node3 node4
[...]
~$ ansible -i ../hosts node3 -b -m shell -a "ps -ef | grep rgw"
192.168.99.103 | CHANGED | rc=0 >>
ceph 4272 1 0 02:05 ? 00:00:00 /usr/bin/radosgw -f --cluster ceph --name client.rgw.node3 --setuser ceph --setgroup ceph
root 5040 5039 0 02:06 pts/0 00:00:00 /bin/sh -c ps -ef | grep rgw
root 5042 5040 0 02:06 pts/0 00:00:00 grep rgw

~$ ansible -i ../hosts node4 -b -m shell -a "ps -ef | grep rgw"
192.168.99.104 | CHANGED | rc=0 >>
ceph 3411 1 0 02:05 ? 00:00:00 /usr/bin/radosgw -f --cluster ceph --name client.rgw.node4 --setuser ceph --setgroup ceph
root 4211 4210 0 02:07 pts/0 00:00:00 /bin/sh -c ps -ef | grep rgw
root 4213 4211 0 02:07 pts/0 00:00:00 grep rgw

# Test HTTP access
~$ curl node3:7480
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
  • Change RGW's default port by adding the following two lines to ceph.conf:
[client.rgw.node4]
# rgw_frontends = "civetweb port=80"
rgw_frontends = civetweb port=80+443s ssl_certificate=/etc/ceph/private/keyandcert.pem
  • Push the configuration file in the current directory to the specified node.
    ~$ ceph-deploy --overwrite-conf config push node4
    ~$ ansible -i hosts node4 -b -m systemd -a "name=radosgw state=restarted daemon_reload=yes"

civetweb configuration

  • If the cluster was not deployed with ceph-deploy, RGW has to be added and configured manually via the steps below:

    # Create the keyring
    ~$ sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.radosgw.keyring
    # Generate a key
    ~$ sudo ceph-authtool /etc/ceph/ceph.client.radosgw.keyring -n client.rgw.node3 --gen-key
    # Set the capabilities
    ~$ sudo ceph-authtool -n client.rgw.node3 --cap osd 'allow rwx' --cap mon 'allow rwx' /etc/ceph/ceph.client.radosgw.keyring
    # Import the keyring into the cluster
    ~$ sudo ceph -k /etc/ceph/ceph.client.admin.keyring auth add client.rgw.node3 -i /etc/ceph/ceph.client.radosgw.keyring
    ~$ cat /etc/ceph/ceph.conf
    [...]
    [client.rgw.node3]
    rgw_frontends = civetweb port=80
    host=node3
    rgw_s3_auth_use_keystone=false
    keyring=/etc/ceph/ceph.client.radosgw.keyring
    log file=/var/log/ceph/client.radosgw.gateway.log
  • Here the cluster was deployed with ceph-deploy, so it is enough to export the corresponding key into /etc/ceph/ceph.client.radosgw.keyring:

    ~$ sudo ceph auth get client.rgw.node3
    exported keyring for client.rgw.node3
    # Copy the stanza below into /etc/ceph/ceph.client.radosgw.keyring
    [client.rgw.node3]
    key = AQC8FNVcl07ALRAAfhr+APpuKW/VvknEzD7hpg==
    caps mon = "allow rw"
    caps osd = "allow rwx"
  • Test access

    ~$ sudo radosgw --cluster ceph --name client.rgw.node3 --setuser ceph --setgroup ceph -d --debug_ms 1 --keyring /etc/ceph/ceph.client.radosgw.keyring
  • If it starts up normally, restart it as a service with systemctl restart ceph-radosgw@rgw.node3; if the service has errors, inspect them with journalctl -u ceph-radosgw@rgw.node3.

Client access

Creating an S3 user

$ ansible -i ../hosts node4 -b -m command -a "radosgw-admin user create --uid=\"lcy\" --display-name=\"admin user test\""
192.168.99.104 | CHANGED | rc=0 >>
{
"user_id": "lcy",
"display_name": "admin user test",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": [],
"keys": [
{
"user": "lcy",
"access_key": "74I2DQ89N5EL1OGCCSCV", # s3cmd needs this
"secret_key": "ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE" # s3cmd needs this
}
],
"swift_keys": [],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"temp_url_keys": [],
"type": "rgw"
}
~$ ansible -i ../hosts node4 -b -m command -a "radosgw-admin user list"
192.168.99.104 | CHANGED | rc=0 >>
[
"testuser",
"lcy"
]
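Since radosgw-admin emits JSON, the keys can be pulled out programmatically instead of copied by hand. A minimal sketch using a trimmed copy of the output above (field names exactly as in the transcript):

```python
import json

# Trimmed from the `radosgw-admin user create` output above.
doc = """
{
  "user_id": "lcy",
  "keys": [
    {
      "user": "lcy",
      "access_key": "74I2DQ89N5EL1OGCCSCV",
      "secret_key": "ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE"
    }
  ]
}
"""
user = json.loads(doc)
access_key = user["keys"][0]["access_key"]
secret_key = user["keys"][0]["secret_key"]
print(access_key)  # the value s3cmd and boto need
```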

Creating a Swift user

  • A Swift user is created as a subuser, so the parent user (here: lcy, created above) must exist first:
    ~$ ansible -i ../hosts node4 -b -m command -a "radosgw-admin subuser create --uid=lcy --subuser=lcy:swift --access=full"
    192.168.99.104 | CHANGED | rc=0 >>
    {
    "user_id": "lcy",
    "display_name": "admin user test",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [
    {
    "id": "lcy:swift",
    "permissions": "full-control"
    }
    ],
    "keys": [
    {
    "user": "lcy",
    "access_key": "74I2DQ89N5EL1OGCCSCV",
    "secret_key": "ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE"
    }
    ],
    "swift_keys": [
    {
    "user": "lcy:swift",
    "secret_key": "bw2zByEnhZMzpSvrb9tYi5rjOT8mK69SkuuWFN8j"
    }
    ],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "placement_tags": [],
    "bucket_quota": {
    "enabled": false,
    "check_on_raw": false,
    "max_size": -1,
    "max_size_kb": 0,
    "max_objects": -1
    },
    "user_quota": {
    "enabled": false,
    "check_on_raw": false,
    "max_size": -1,
    "max_size_kb": 0,
    "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw"
    }

Testing with the Python client libraries

~$ pip install boto python-swiftclient
~$ ipython
In [1]: access_key = '74I2DQ89N5EL1OGCCSCV'
In [2]: secret_key = 'ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE'
In [3]: import boto.s3.connection
In [4]: conn = boto.connect_s3(aws_access_key_id=access_key,aws_secret_access_key=secret_key,host='192.168.99.103',port=7480,is_secure=False,calling_format=boto.s3.connection.OrdinaryCallingFormat())
In [5]: bkt = conn.create_bucket('ooo-bucket')
In [6]: for bkt in conn.get_all_buckets():
...: print("{name} {created}".format(name=bkt.name,created=bkt.creation_date))
...: # bucket created and listed successfully.
ooo-bucket 2019-05-10T07:08:26.456Z
# Test with the swift client.
~$ swift -A http://node4:7480/auth/1.0 -U lcy:swift -K 'bw2zByEnhZMzpSvrb9tYi5rjOT8mK69SkuuWFN8j' list
ooo-bucket

Testing with s3cmd

  • Before using s3cmd, its parameters are normally set interactively with s3cmd --configure. Here we skip that and write the necessary connection parameters directly. Either RGW node (node3 or node4) can be used for the test.

    ~$ sudo apt install s3cmd
    ~$ cat <<EOF > ~/.s3cfg
    [default]
    access_key = 74I2DQ89N5EL1OGCCSCV
    host_base = node3:7480
    host_bucket = node3:7480/%(bucket)
    secret_key = ePz9ONOrZS4BB8RN44KBYxCzRA0UNz8Kyu5kXzvE
    cloudfront_host = node3:7480
    use_https = False
    bucket_location = US
    EOF

    # List all buckets
    ~$ s3cmd ls
    2019-05-10 07:08 s3://ooo-bucket
    # Create a bucket
    ~$ s3cmd mb s3://sql
    # Upload a file into the bucket
    ~$ s3cmd put ~/wxdb-20190422-1638.sql s3://sql
    upload: '/home/lcy/wxdb-20190422-1638.sql' -> 's3://sql/wxdb-20190422-1638.sql' [1 of 1]
    197980 of 197980 100% in 1s 104.33 kB/s done
    # List the files in the bucket
    ~$ s3cmd ls s3://sql
    2019-05-10 08:12 197980 s3://sql/wxdb-20190422-1638.sql
    # Download a file from the bucket to the local machine.
    ~$ s3cmd get s3://sql/wxdb-20190422-1638.sql
    download: 's3://sql/wxdb-20190422-1638.sql' -> './wxdb-20190422-1638.sql' [1 of 1]
    197980 of 197980 100% in 0s 57.23 MB/s done
  • Check the cluster utilization statistics

    ~$ ansible -i ../hosts node2 -b -m command -a "ceph df"
    192.168.99.102 | CHANGED | rc=0 >>
    GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    20.7GiB 18.7GiB 2.01GiB 9.72
    POOLS:
    NAME ID USED %USED MAX AVAIL OBJECTS
    .rgw.root 1 2.08KiB 0 5.83GiB 6
    default.rgw.control 2 0B 0 5.83GiB 8
    default.rgw.meta 3 2.13KiB 0 5.83GiB 12
    default.rgw.log 4 0B 0 5.83GiB 207
    default.rgw.buckets.index 5 0B 0 5.83GiB 3
    default.rgw.buckets.data 6 193KiB 0 5.83GiB 1

Mounting the filesystem with s3fs-fuse

  • s3fs-fuse
  • Reference for storing Django files on AWS S3
  • The original plan was to mount a Ceph S3 bucket directly as a Docker volume, but that has not been made to work yet. Below, the bucket is mounted on the host instead and then passed into Docker with -v.
    ~$ sudo apt install s3fs fuse

    # The credentials can also go in /etc/passwd-s3fs
    ~$ echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > ${HOME}/.passwd-s3fs && chmod 600 ${HOME}/.passwd-s3fs
    ~$ s3cmd ls
    2019-05-10 08:10 s3://iso
    2019-05-16 03:50 s3://media # this bucket is mounted as a directory below.
    2019-05-10 07:08 s3://ooo-bucket
    2019-05-16 06:44 s3://public
    2019-05-10 08:10 s3://sql

    # Note: Ceph S3 requires the use_path_request_style option, since it is not native AWS.
    ~$ s3fs media /data/s3fs -o allow_other,umask=022,use_path_request_style,url=http://node3

    ~$ df -h | grep s3fs
    s3fs 256T 0 256T 0% /data/s3fs

    ~$ grep s3fs /etc/mtab
    s3fs /data/s3fs fuse.s3fs rw,nosuid,nodev,relatime,user_id=1000,group_id=120,allow_other 0 0

    # If mounted without umask, the default is 0000, i.e. no access permissions.
    ~$ ls -l /data/s3fs/
    total 9397
    drwxr-xr-x 1 root root 0 Jan 1 1970 hls
    -rwxr-xr-x 1 root root 3100721 May 16 14:24 video.mp4
  • For debugging, start it with -o dbglevel=info -f -o curldbg. For the remaining features, see its GitHub page and its help output.

Warnings and errors

  • Reference documentation

    $ ansible -i ../hosts node1 -b -m command -a "ceph -s"
    192.168.99.101 | CHANGED | rc=0 >>
    cluster:
    id: 0bf150da-b691-4382-bf3d-600e90c19fba
    health: HEALTH_WARN
    Degraded data redundancy: 237/711 objects degraded (33.333%), 27 pgs degraded, 48 pgs undersized

    services:
    mon: 4 daemons, quorum node1,node2,node3,node4
    mgr: node1(active), standbys: node2, node3
    osd: 2 osds: 2 up, 2 in
    rgw: 2 daemons active

    data:
    pools: 6 pools, 48 pgs
    objects: 237 objects, 198KiB
    usage: 2.01GiB used, 18.7GiB / 20.7GiB avail
    pgs: 237/711 objects degraded (33.333%)
    27 active+undersized+degraded
    21 active+undersized
  • Given the warning above about degraded PGs, restart the OSD service on the nodes with systemctl restart ceph-osd.target and check again.

  • On closer inspection, the cause is that the replica count is 3 while only two OSDs exist here, which triggers the degradation warning above. Either lower the replica count to 2 or add another OSD node.
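The 33.333% in the warning follows directly from that: with a pool size of 3, the 237 objects should hold 711 copies in total, and each object is one replica short. An illustrative check of the arithmetic:

```python
objects = 237
pool_size = 3        # osd pool default size
osds_up = 2          # only two OSDs existed at this point

expected_copies = objects * pool_size          # 711
missing = objects * (pool_size - osds_up)      # one missing replica per object
print(f"{missing}/{expected_copies} objects degraded "
      f"({missing / expected_copies:.3%})")
```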

  • Below, just as with node2, add a 20G disk, split it into two partitions, and create the OSD in the (block, block.wal) layout.

~$ ceph-deploy osd create node3 --data /dev/sdb2 --block-wal /dev/sdb1

~$ ansible -i ../hosts node1 -b -m command -a "ceph osd tree"
192.168.99.101 | CHANGED | rc=0 >>
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.03058 root default
-3 0.00980 host node1
0 hdd 0.00980 osd.0 up 1.00000 1.00000
-5 0.01039 host node2
1 hdd 0.01039 osd.1 up 1.00000 1.00000
-7 0.01039 host node3
2 hdd 0.01039 osd.2 up 1.00000 1.00000

# Check again; the status is back to normal.
~$ ansible -i ../hosts node1 -b -m command -a "ceph health"
192.168.99.101 | CHANGED | rc=0 >>
HEALTH_OK
  • Ceph: HEALTH_WARN clock skew detected
# Enable ntp at boot on all nodes.
~$ ansible -i hosts all -b -m systemd -a "name=ntp enabled=yes state=started"
  • Handling the "application not enabled on 1 pool(s)" warning
~$ sudo ceph health detail
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
application not enabled on pool 'kube'
use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications.
~$ sudo ceph osd pool application enable kube rbd
enabled application 'rbd' on pool 'kube'
  • After installation, the listening services on each node are as follows:
    $ ansible -i hosts node -b -m command -a "netstat -tnlp"
    192.168.99.102 | CHANGED | rc=0 >>
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    tcp 0 0 192.168.99.102:6789 0.0.0.0:* LISTEN 476/ceph-mon
    tcp 0 0 192.168.99.102:6800 0.0.0.0:* LISTEN 875/ceph-osd
    tcp 0 0 192.168.99.102:6801 0.0.0.0:* LISTEN 875/ceph-osd
    tcp 0 0 192.168.99.102:6802 0.0.0.0:* LISTEN 875/ceph-osd
    tcp 0 0 192.168.99.102:6803 0.0.0.0:* LISTEN 875/ceph-osd
    tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 529/sshd
    tcp6 0 0 :::22 :::* LISTEN 529/sshd

    192.168.99.101 | CHANGED | rc=0 >>
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    tcp 0 0 192.168.99.101:6789 0.0.0.0:* LISTEN 480/ceph-mon
    tcp 0 0 192.168.99.101:6800 0.0.0.0:* LISTEN 1015/ceph-osd
    tcp 0 0 192.168.99.101:6801 0.0.0.0:* LISTEN 1015/ceph-osd
    tcp 0 0 192.168.99.101:6802 0.0.0.0:* LISTEN 1015/ceph-osd
    tcp 0 0 192.168.99.101:6803 0.0.0.0:* LISTEN 1015/ceph-osd
    tcp 0 0 192.168.99.101:6804 0.0.0.0:* LISTEN 476/ceph-mgr
    tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 537/sshd
    tcp6 0 0 :::22 :::* LISTEN 537/sshd

    192.168.99.103 | CHANGED | rc=0 >>
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    tcp 0 0 192.168.99.103:6789 0.0.0.0:* LISTEN 479/ceph-mon
    tcp 0 0 192.168.99.103:6800 0.0.0.0:* LISTEN 965/ceph-osd
    tcp 0 0 192.168.99.103:6801 0.0.0.0:* LISTEN 965/ceph-osd
    tcp 0 0 192.168.99.103:6802 0.0.0.0:* LISTEN 965/ceph-osd
    tcp 0 0 192.168.99.103:6803 0.0.0.0:* LISTEN 965/ceph-osd
    tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 527/sshd
    tcp 0 0 0.0.0.0:7480 0.0.0.0:* LISTEN 480/radosgw
    tcp6 0 0 :::22 :::* LISTEN 527/sshd

    192.168.99.104 | CHANGED | rc=0 >>
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    tcp 0 0 192.168.99.104:6789 0.0.0.0:* LISTEN 445/ceph-mon
    tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 447/radosgw
    tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 515/sshd
    tcp6 0 0 :::22 :::* LISTEN 515/sshd

Kubernetes Integration

Creating RBD Images

  • RBD operations must be run while logged in on a cluster node; ceph-deploy provides no interface for them. Ansible can be used to run them remotely.

# For how to calculate a pool's PG count, see https://ceph.com/pgcalc/
    ~$ sudo ceph osd pool create kube 64 64
    pool 'kube' created

# Set the pool's replica count.
    ~$ sudo ceph osd pool set kube size 2

    ~$ sudo ceph osd lspools
    1 .rgw.root,2 default.rgw.control,3 default.rgw.meta,4 default.rgw.log,5 default.rgw.buckets.index,6 default.rgw.buckets.data,7 volumes,8 kube,

    ~$ sudo rbd create kube/cephimage2 --size 40960
    ~$ sudo rbd list kube
    cephimage2

    ~$ sudo rbd info kube/cephimage2
    rbd image 'cephimage2':
    size 40GiB in 10240 objects
    order 22 (4MiB objects)
    block_name_prefix: rbd_data.519a06b8b4567
    format: 2
    #
    features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
    flags:
    create_timestamp: Mon May 13 01:44:35 2019

    ~$ sudo rbd create kube/cephimage1 --size 10240

# Resize cephimage1 from its original 10G up to 20G.
    ~$ sudo rbd resize kube/cephimage1 --size 20480

~$ sudo rbd create kube/cephimage3 --size 4096 --image-feature layering
  • By default, newly created RBD images enable the layering, exclusive-lock, object-map, fast-diff, and deep-flatten features. Older Linux kernels do not support all of them; typically they support only layering. If the kernel is too old, creating a Pod fails with the error below.

MountVolume.WaitForAttach failed for volume "ceph-rbd-pv" : rbd: map failed exit status 6, rbd output: rbd: sysfs write failed RBD image feature set mismatch. Try disabling features unsupported by the kernel with "rbd feature disable". In some cases useful info is found in syslog - try "dmesg | tail". rbd: map failed: (6) No such device or address

~# dmesg
[1355258.253726] rbd: image foo: image uses unsupported features: 0x38
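As the error message itself suggests, the unsupported features can also be stripped from an already-created image with rbd feature disable. A sketch, reusing the cephimage2 example above (adjust the feature list to match what your kernel lacks):

```
# Disable the features a pre-4.x kernel typically cannot map;
# layering is left enabled.
~$ sudo rbd feature disable kube/cephimage2 deep-flatten fast-diff object-map exclusive-lock
```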
  • Create the Pod in the cluster

    ~$ git clone https://github.com/kubernetes/examples.git
    ~$ cd examples/staging/volumes/rbd/
    ~$ tree
    .
    ├── rbd-with-secret.yaml
    ├── rbd.yaml
    ├── README.md
    └── secret
    └── ceph-secret.yaml
  • Edit rbd-with-secret.yaml so that it reads:

    apiVersion: v1
    kind: Pod
    metadata:
      name: rbd2
    spec:
      containers:
        - image: busybox
          command: ["sleep", "60000"]
          name: rbd-rw
          volumeMounts:
            - name: rbdpd
              mountPath: /mnt/rbd
      volumes:
        - name: rbdpd
          rbd:
            monitors:
              - '192.168.99.101:6789'
              - '192.168.99.102:6789'
              - '192.168.99.103:6789'
              - '192.168.99.104:6789'
            pool: kube
            image: cephimage3
            fsType: ext4
            readOnly: false
            user: admin
            secretRef:
              name: ceph-secret
  • Edit ceph-secret.yaml, taking care to replace the key field.

    ~$ ansible -i hosts node1 -b -m command -a "cat /etc/ceph/ceph.client.admin.keyring" | grep key | awk '{printf "%s",$NF}' | base64
    QVFESDB0UmNFSStwR3hBQUJ4aW1ZT1VXRWVTckdzSStpZklCOWc9PQ==
    ~$ cat secret/ceph-secret.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: ceph-secret
    type: "kubernetes.io/rbd"
    data:
      key: QVFESDB0UmNFSStwR3hBQUJ4aW1ZT1VXRWVTckdzSStpZklCOWc9PQ== # taken from the command output above.
  • Create the Secret and the Pod

    ~$ kubectl create -f secret/ceph-secret.yaml
    ~$ kubectl create -f rbd-with-secret.yaml

    ~$ kubectl get pods -o wide
    NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
    rbd2 1/1 Running 0 60m 10.244.1.2 node2 <none> <none>
    ~$ kubectl get secret
    NAME TYPE DATA AGE
    ceph-secret kubernetes.io/rbd 1 17h

    # The RBD volume can now be used like a local disk.
    ~$ kubectl exec -it rbd2 -- df -h | grep -e "rbd0" -e "secret"
    /dev/rbd0 3.9G 16.0M 3.8G 0% /mnt/rbd
    tmpfs 498.2M 12.0K 498.2M 0% /var/run/secrets/kubernetes.io/serviceaccount
  • Create and test an RBD-backed PV and PVC

~$ cat rbd-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-rbd-pv
spec:
  capacity:
    storage: 4Gi
  accessModes:
    - ReadWriteOnce
  rbd:
    monitors:
      - '192.168.99.101:6789'
      - '192.168.99.102:6789'
      - '192.168.99.103:6789'
      - '192.168.99.104:6789'
    pool: kube
    image: cephimage1
    fsType: ext4
    readOnly: false
    user: admin
    secretRef:
      name: ceph-secret
  persistentVolumeReclaimPolicy: Recycle

~$ cat rbd-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi

~$ kubectl create -f rbd-pv.yaml
~$ kubectl create -f rbd-pvc.yaml
~$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
ceph-rbd-pv 4Gi RWO Recycle Bound default/ceph-rbd-pvc 17h
~$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ceph-rbd-pvc Bound ceph-rbd-pv 4Gi RWO 17h

Installing with Ceph-Ansible

On the New Ethernet Interface Naming Scheme

enp0s10:
| | |
v | |
en| | --> ethernet
v |
p0| --> bus number (0)
v
s10 --> slot number (10)
  • If you are not used to the new naming, there are three ways to revert to the old style:

  • You basically have three options:

    1. You disable the assignment of fixed names, so that the unpredictable kernel names are used again. For this, simply mask udev’s .link file for the default policy: ln -s /dev/null /etc/systemd/network/99-default.link
    2. You create your own manual naming scheme, for example by naming your interfaces “internet0”, “dmz0” or “lan0”. For that create your own .link files in /etc/systemd/network/, that choose an explicit name or a better naming scheme for one, some, or all of your interfaces. See systemd.link(5) for more information.
    3. You pass net.ifnames=0 on the kernel command line.
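For option 2, a manual naming scheme is just a small .link unit. A sketch (the MAC address and the name lan0 are placeholder values; see systemd.link(5)):

```
# /etc/systemd/network/10-lan0.link
[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=lan0
```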
  • List the virtual machines

    ~$ VBoxManage list vms
    "k8s-master" {7bfb1ca4-3ccc-4a1a-8548-7759424df181}
    "k8s-node1" {4c29c029-4f93-4463-b83d-4ae9e728e9df}
    "k8s-node2" {87a2196c-cf3c-472a-9ffa-f5b8c3e09009}
    "k8s-node3" {af9e34cf-a7c9-45d8-ad15-f37d409bcdac}
    "k8s-node4" {1f46e865-01c1-4a81-a947-cc267c744756}

    # Start the above VM with VBoxHeadless; it will not open a window.
    ~$ VBoxHeadless --startvm k8s-master
  • The following installs ceph-deploy as described on the official site, but apt cannot find a package named ceph-deploy.

    ~$ wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
    # Replace {ceph-stable-release} with a Ceph stable release name (e.g. cuttlefish, dumpling, emperor, firefly, nautilus).
    ~$ echo deb http://download.ceph.com/debian-{ceph-stable-release}/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
    ~$ sudo apt-get update && sudo apt-get install ceph-deploy
  • Below, this is converted into an Ansible playbook. Note that Ceph on GitHub also maintains a ceph-ansible project, which I have not used; it has nearly 1k stars.

---
- name: Install base software
  hosts: all
  become: yes
  # user: root — root could be used directly, but once remote root login is disabled you must use sudo.
  tasks:
    # See https://docs.ansible.com/ansible/latest/modules/command_module.html#command-module
    - name: Read the distribution release codename
      command: lsb_release -sc
      register: result

    - name: Install the Ceph release key
      apt_key:
        url: https://download.ceph.com/keys/release.asc
        state: present

    # See https://docs.ansible.com/ansible/latest/modules/apt_repository_module.html?highlight=add%20apt%20repository
    - name: ceph-deploy
      apt_repository:
        repo: deb http://download.ceph.com/debian-nautilus {{ result.stdout }} main
        state: present
        filename: ceph

    # See https://docs.ansible.com/ansible/latest/modules/apt_key_module.html?highlight=apt%20key
    - name: Add the docker-ce public key
      apt_key:
        url: https://download.docker.com/linux/debian/gpg
        state: present

    # See https://docs.ansible.com/ansible/latest/modules/apt_repository_module.html?highlight=add%20apt%20repository
    - name: docker-ce
      apt_repository:
        repo: deb [arch=amd64] https://download.docker.com/linux/debian {{ result.stdout }} stable
        state: present
        filename: docker-ce

    # See https://docs.ansible.com/ansible/latest/modules/apt_module.html
    - name: Update the cache and install packages
      apt:
        name:
          ['ntp', 'ntpdate', 'ntp-doc', 'docker-ce', 'bridge-utils', 'ipvsadm']
        allow_unauthenticated: yes
        update_cache: yes

    # See https://docs.ansible.com/ansible/latest/modules/lineinfile_module.html?highlight=sudoers
    # Ansible's sysctl module would also work: https://docs.ansible.com/ansible/latest/modules/sysctl_module.html?highlight=sysctl
    - name: Write the sysctl settings
      lineinfile:
        path: /etc/sysctl.d/80-k8s.conf
        create: yes
        line: '{{ item }}'
      with_items:
        - 'net.bridge.bridge-nf-call-ip6tables = 1'
        - 'net.bridge.bridge-nf-call-iptables = 1'
        - 'net.bridge.bridge-nf-call-arptables = 1'
        - 'net.ipv4.ip_forward = 1'

    - name: Reload sysctl
      command: sysctl --system

    - block:
        # Naming scheme references: https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
        # https://major.io/2015/08/21/understanding-systemds-predictable-network-device-names/
        - name: Use the old NIC naming scheme
          file:
            src: /dev/null
            dest: /etc/systemd/network/99-default.link
            state: link

        # The .link mask above did not seem to take effect on Debian; fall back to the kernel parameter.
        - name: Update the kernel command line
          lineinfile:
            path: /etc/default/grub
            regexp: '^GRUB_CMDLINE_LINUX='
            line: 'GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0"'

        - name: Regenerate grub.cfg
          command: grub-mkconfig -o /boot/grub/grub.cfg

Installing the Kubernetes Master

~$ sudo kubeadm init --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers  --kubernetes-version v1.14.1 --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=192.168.99.100
[...]
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user: # Follow these steps in order: install the network add-on first, then join the other nodes.

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.99.100:6443 --token ejtj7f.oth6on2k6y0qcj2k \
--discovery-token-ca-cert-hash sha256:d162721230250668a4296aca699867126314a9ecd2418f9c70110b6b02bd01de

# Continue by installing the network add-on.
~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
  • By default the cluster runs kube-proxy in iptables mode; switch it to ipvs mode manually. Open the ConfigMap with kubectl -n kube-system edit cm kube-proxy, change mode: "" to mode: "ipvs", then delete the old pod with kubectl -n kube-system delete pod kube-proxy-xxx; a new pod is created automatically.
~$ kubectl -n kube-system logs kube-proxy-t27xd
I0514 06:33:30.681150 1 server_others.go:177] Using ipvs Proxier. ---> switched to ipvs mode.
W0514 06:33:30.738710 1 proxier.go:381] IPVS scheduler not specified, use rr by default
I0514 06:33:30.747818 1 server.go:555] Version: v1.14.1
[...]

# View the ipvs table.
~$ sudo ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.17.0.1:32047 rr
TCP 192.168.99.100:32047 rr
TCP 10.0.2.15:32047 rr
TCP 10.96.0.1:443 rr
-> 192.168.99.100:6443 Masq 1 3 0
[...]
  • Join the nodes to the k8s cluster in bulk with Ansible.

    ~$ cat hosts
    [master]
    192.168.99.100
    [node1]
    192.168.99.101
    [node2]
    192.168.99.102
    [node3]
    192.168.99.103
    [node4]
    192.168.99.104
    [node]
    192.168.99.101
    192.168.99.102
    192.168.99.103
    192.168.99.104

    ~$ ansible -i hosts node -b -m command -a "kubeadm join 192.168.99.100:6443 --token ejtj7f.oth6on2k6y0qcj2k --discovery-token-ca-cert-hash sha256:d162721230250668a4296aca699867126314a9ecd2418f9c70110b6b02bd01de"
  • Check the master node's status

    ~$ kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    k8s-master NotReady master 15h v1.14.1
    node1 NotReady <none> 15h v1.14.1
    node2 NotReady <none> 15h v1.14.1
    node3 NotReady <none> 15h v1.14.1
    node4 NotReady <none> 15h v1.14.1

    # Why are all the nodes NotReady?
    ~$ kubectl get pods -n kube-system
    NAME READY STATUS RESTARTS AGE
    coredns-d5947d4b-kfhlp 0/1 Pending 0 15h
    coredns-d5947d4b-sq95j 0/1 Pending 0 15h
    etcd-k8s-master 1/1 Running 2 15h
    kube-apiserver-k8s-master 1/1 Running 2 15h
    kube-controller-manager-k8s-master 1/1 Running 2 15h
    kube-proxy-25vgp 1/1 Running 2 15h
    kube-proxy-75xjc 1/1 Running 1 15h
    kube-proxy-bvdh6 1/1 Running 1 15h
    kube-proxy-lzp8m 1/1 Running 1 15h
    kube-proxy-wnmwk 1/1 Running 1 15h
    kube-scheduler-k8s-master 1/1 Running 2 15h

    # Why is coredns Pending?
    ~$ kubectl describe pod coredns -n kube-system
    [...]
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Warning FailedScheduling 10m (x49 over 81m) default-scheduler 0/5 nodes are available: 5 node(s) had taints that the pod didn't tolerate.
    Warning FailedScheduling 75s (x4 over 5m21s) default-scheduler 0/5 nodes are available: 5 node(s) had taints that the pod didn't tolerate.

    # Check the system journal
    ~$ sudo journalctl -u kubelet
    # The cause: the network add-on had not been installed.
    ~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml

Building with Rook

  • Ceph and Rook integration (figure: CephRook)
  • Rook architecture (figure: rook architecture)

Installing Ceph

~$ git clone https://github.com/rook/rook
~$ cd rook/cluster/examples/kubernetes/ceph/
~$ kubectl create -f common.yaml
~$ kubectl create -f operator.yaml
~$ kubectl create -f cluster.yaml
~$ kubectl -n rook-ceph get pods
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-f7ln5 1/1 Running 0 5m36s
rook-ceph-agent-fzztf 1/1 Running 0 5m36s
rook-ceph-agent-mgqk6 1/1 Running 0 5m36s
rook-ceph-agent-qdbmh 1/1 Running 0 5m36s
rook-ceph-agent-twsvp 1/1 Running 0 5m36s
rook-ceph-operator-775cf575c5-8k44f 1/1 Running 1 6m30s
rook-discover-d4btd 1/1 Running 0 5m36s
rook-discover-fbq9w 1/1 Running 0 5m36s
rook-discover-gcksv 1/1 Running 0 5m36s
rook-discover-hnbdj 1/1 Running 0 5m36s
rook-discover-j5x5h 1/1 Running 0 5m36s

Tearing Down Rook

~$ cat remove-nodes-rooks-containers.sh
for i in `seq 1 4`; do
for n in `ansible -i hosts node$i -b -m command -a "docker ps -a" | awk 'NR>2 {print $1}'`;do
#ansible -i hosts node$i -b -m command -a "docker stop $n ; docker rm $n";
ansible -i hosts node$i -b -m command -a "docker rm $n";
done;
done

~$ cat remove-rook-cluster-data.sh
ansible -i hosts all -b -m file -a "path=/var/lib/rook state=absent"
ansible -i hosts all -b -m file -a "path=/etc/kubernetes state=absent"
ansible -i hosts all -b -m file -a "path=/var/lib/kubelet state=absent"

Errors

~$ kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
rook-ceph-mon-a-f799d9cf6-xrg8f 0/1 Init:CrashLoopBackOff 6 8m46s
rook-ceph-mon-d-5dd7b4d56f-wwg8n 0/1 Init:CrashLoopBackOff 6 7m1s
rook-ceph-mon-f-7977bd98c9-9b6h4 0/1 Init:CrashLoopBackOff 5 5m19s

~$ kubectl -n rook-ceph describe pod rook-ceph-mon-a
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9m15s default-scheduler Successfully assigned rook-ceph/rook-ceph-mon-a-f799d9cf6-xrg8f to k8s-master
Normal Pulled 7m20s (x5 over 9m4s) kubelet, k8s-master Container image "rook/ceph:v0.9.3" already present on machine
Normal Created 7m19s (x5 over 9m2s) kubelet, k8s-master Created container config-init
Normal Started 7m18s (x5 over 8m59s) kubelet, k8s-master Started container config-init
Warning BackOff 3m52s (x26 over 8m52s) kubelet, k8s-master Back-off restarting failed container

~$ kubectl -n rook-ceph describe pod rook-ceph-mon-d
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m2s default-scheduler Successfully assigned rook-ceph/rook-ceph-mon-d-5dd7b4d56f-wwg8n to node1
Normal Pulled 6m15s (x5 over 7m45s) kubelet, node1 Container image "rook/ceph:v0.9.3" already present on machine
Normal Created 6m15s (x5 over 7m45s) kubelet, node1 Created container config-init
Normal Started 6m14s (x5 over 7m45s) kubelet, node1 Started container config-init
Warning BackOff 2m41s (x26 over 7m43s) kubelet, node1 Back-off restarting failed container


PaaS Overview

  • PaaS (Platform as a Service) delivers a software development platform (or business platform) as a service, offered to users in the SaaS model. PaaS is one of the service models of cloud computing. Cloud computing is a pay-per-use model, akin to renting; the service can be infrastructure compute resources (IaaS), a platform (PaaS), or software (SaaS). Renting IT resources to meet business needs means compute, storage, and networking become consumable resources for enterprise IT, like water or electricity, available on demand with no need to build them yourself. The essence of PaaS is turning Internet resources into programmable interfaces, giving third-party developers a commercially valuable resource and service platform. In short: IaaS sells hardware and compute, PaaS sells a development and runtime environment, SaaS sells software.
Type | Description | Analogy | Examples
IaaS: Infrastructure-as-a-Service | Provides computing infrastructure | A plot of land: you build the house yourself | Amazon EC2 (Amazon Elastic Compute Cloud), Alibaba Cloud
PaaS: Platform-as-a-Service | Provides a software development or business platform | An unfurnished apartment: you do the fitting-out | GAE (Google App Engine), Heroku
SaaS: Software-as-a-Service | Provides applications running on cloud infrastructure | A hotel suite: move straight in | Google's Gmail

Kubernetes Overview

  • What is Kubernetes?
  • Kubernetes is Google's open-source container cluster management system. Built on Docker, it provides a full suite of features for containerized applications: resource scheduling, deployment, service discovery, and scaling. It can essentially be seen as a container-based Micro-PaaS, a representative third-generation PaaS project.
  • Kubernetes architecture diagram (figure: Kubernetes)

Basic Kubernetes Concepts

Pod

  • A Pod is a group of related containers and a logical concept. The containers in a Pod run on the same host, share a network namespace, IP address, and ports, can discover and talk to each other via localhost, and share a storage volume. The Pod is the smallest unit that Kubernetes creates, schedules, and manages. A Pod usually holds one business container plus one network container for unified network management.
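As an illustration, a minimal Pod manifest with a single business container (the names and image here are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod          # hypothetical name
  labels:
    app: web
spec:
  containers:
    - name: web
      image: nginx:1.17  # the single business container
      ports:
        - containerPort: 80
```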

Replication Controller

  • A Replication Controller manages Pod replicas. It ensures that a specified number of Pod replicas are running in the Kubernetes cluster at all times: if there are fewer than specified, it starts new ones; if there are more, it kills the excess to keep the count constant. The Replication Controller is also the core of elastic scaling and rolling upgrades.

Deployment

  • A Deployment is a simpler mechanism for updating RCs and Pods. You describe the desired cluster state in the Deployment, and the Deployment Controller gradually drives the current state toward it at a controlled rate. A Deployment's main responsibility is likewise keeping Pods healthy and at the right count, and most of its features are identical to the Replication Controller's, so it can be seen as a superset of a new-generation RC. Its features include event and status inspection, rollback, revision history, pause and resume, and multiple upgrade strategies (Recreate, RollingUpdate).
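A sketch of a Deployment expressing the desired state described above (names and image are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deploy        # hypothetical name
spec:
  replicas: 3             # desired state: three Pod replicas
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate   # one of the upgrade strategies mentioned above
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.17
```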

Job

  • By runtime behavior, Pods fall into two classes: long-running services (jboss, mysql, nginx, etc.) and one-off tasks (such as parallel data processing or tests). Pods created by an RC are long-running services; Pods created by a Job are one-off tasks.
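A sketch of a one-off task as a Job (the pi computation is a stock illustration, not from this setup):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-once           # hypothetical one-off task
spec:
  template:
    spec:
      containers:
        - name: pi
          image: perl
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never  # Job Pods run to completion instead of restarting
  backoffLimit: 4
```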

StatefulSet

  • StatefulSet is for stateful applications and distributed systems. It is comparatively complex; adopt it only when the application needs:
    • A unique, stable network identifier.
    • Stable, persistent data storage.
    • Ordered deployment and scaling.
    • Ordered deletion and termination.
    • Ordered automated rolling updates.

Service

  • Kubernetes Service documentation
  • A Service is an abstraction over real application services: it defines a logical set of Pods and a policy for accessing them. The Service presents the proxied Pods as a single access endpoint; the outside world need not know how the backend Pods run, which greatly helps scaling and maintenance and provides a simplified service proxy and discovery mechanism. There are four Service types:
    • ClusterIP: exposes the service on a cluster-internal IP, reachable only inside the cluster and not by external clients. This is the default type.
    • NodePort: built on top of ClusterIP; exposes the service on a static port (the NodePort) of each node's IP. It still allocates a cluster IP for the Service and uses it as the routing target of the NodePort.
    • LoadBalancer: built on top of NodePort; exposes the service outside the cluster via a cloud provider's load balancer, so it also has a NodePort and a ClusterIP. (Currently only cloud providers support this; in a VirtualBox lab only ClusterIP and NodePort are usable.)
    • ExternalName: exposes the service by mapping it to the hostname given in the externalName field; that hostname must resolve via DNS to a CNAME record.
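A sketch of the NodePort type described above (the name, selector, and port numbers are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc           # hypothetical name
spec:
  type: NodePort          # also allocates a ClusterIP, as described above
  selector:
    app: web              # routes to Pods carrying this label
  ports:
    - port: 80            # Service (cluster IP) port
      targetPort: 80      # container port
      nodePort: 30080     # static port opened on every node
```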

Ingress

  • The Ingress resource implements an "HTTP(S) load balancer". It is one of the standard k8s API resource types: essentially a set of rules that forward requests to a given Service based on DNS name or URL path, used to route traffic from outside the cluster to services inside it. The Ingress resource itself does not carry traffic; it is only a collection of routing rules. Unlike the Deployment controller and others, the Ingress controller does not run as part of kube-controller-manager; like CoreDNS, it is an important cluster add-on that must be deployed separately.
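A sketch of such a rule set, forwarding by host and path to a Service (the API group matches the k8s 1.14 era used in this post; host and Service name are hypothetical):

```yaml
apiVersion: extensions/v1beta1   # networking.k8s.io/v1 in newer releases
kind: Ingress
metadata:
  name: web-ing           # hypothetical name
spec:
  rules:
    - host: www.example.com   # forward by DNS name...
      http:
        paths:
          - path: /           # ...and URL path
            backend:
              serviceName: web-svc   # hypothetical backend Service
              servicePort: 80
```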

Label

  • A Label is a key/value pair used to distinguish Pods, Services, and Replication Controllers; in fact any API object in Kubernetes can be tagged with Labels. An API object can carry multiple Labels, but each Label key maps to exactly one value. Labels are the basis on which Services and Replication Controllers operate: both associate with Pods through Labels, a very useful loose coupling compared with a rigid binding model.

Node

  • Kubernetes has a master/worker distributed cluster architecture. A Kubernetes Node (simply Node, called Minion in early versions) runs and manages containers. As Kubernetes' unit of operation, a Node is what Pods (that is, containers) are bound and scheduled to; Pods ultimately run on Nodes, so a Node can be thought of as a Pod's host machine.

Kubernetes Architecture

Master Node

  • The Master is the brain of the k8s cluster. It runs:
    • kube-apiserver
    • kube-scheduler
    • kube-controller-manager
    • etcd, and the Pod network (e.g. Flannel, Canal)
API Server (kube-apiserver)
  • The API Server is the cluster's front-end interface; various tools manage the cluster's resources through it.
Scheduler (kube-scheduler)
  • The Scheduler decides which Node each Pod runs on, taking the cluster topology fully into account to find an optimal placement.
Controller Manager (kube-controller-manager)
  • The Controller Manager manages the cluster's resources and keeps them in their desired state. It is composed of multiple controllers.
etcd
  • etcd stores k8s configuration and the state of all resources. When data changes, etcd quickly notifies the relevant k8s components.
Pod network
  • For Pods to communicate with each other, k8s must deploy a Pod network; flannel is one of the options.

Worker Node

  • Nodes are where Pods run; k8s supports Docker, rkt, and other container runtimes. The components running on a Node are kubelet, kube-proxy, and the Pod network.
kubelet
  • The kubelet is the Node's agent. When the Scheduler has placed a Pod on a Node, it sends that Pod's concrete configuration (image, volumes, ...) to the Node's kubelet; the kubelet creates and runs the containers accordingly and reports status back to the master.
kube-proxy
  • A Service logically represents multiple backend Pods, and the outside world reaches the Pods through the Service. kube-proxy forwards the requests a Service receives to the corresponding Pods, load-balancing across replicas when there are several.
Secret & ConfigMap
  • A Secret supplies Pods with sensitive data such as passwords, tokens, and private keys; for non-sensitive data such as application configuration, use a ConfigMap.
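A sketch of the two side by side: a ConfigMap for plain settings and a Secret for a password (all names and values hypothetical; Secret data is base64-encoded):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  log_level: info          # plain, non-sensitive configuration
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  password: cGFzc3dvcmQ=   # base64("password")
```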

Add-ons

Networking
  • ACI provides integrated container networking and network security with Cisco ACI.
  • Calico is a secure L3 networking and network policy provider.
  • Canal unites Flannel and Calico, providing networking and network policy.
  • Cilium is a L3 network and network policy plugin that can enforce HTTP/API/L7 policies transparently. Both routing and overlay/encapsulation mode are supported.
  • CNI-Genie enables Kubernetes to seamlessly connect to a choice of CNI plugins, such as Calico, Canal, Flannel, Romana, or Weave.
  • Contiv provides configurable networking (native L3 using BGP, overlay using vxlan, classic L2, and Cisco-SDN/ACI) for various use cases and a rich policy framework. Contiv project is fully open sourced. The installer provides both kubeadm and non-kubeadm based installation options.
  • Contrail, based on Tungsten Fabric, is a open source, multi-cloud network virtualization and policy management platform. Contrail and Tungsten Fabric are integrated with orchestration systems such as Kubernetes, OpenShift, OpenStack and Mesos, and provide isolation modes for virtual machines, containers/pods and bare metal workloads.
  • Flannel is an overlay network provider that can be used with Kubernetes.
  • Knitter is a network solution supporting multiple networking in Kubernetes.
  • Multus is a Multi plugin for multiple network support in Kubernetes to support all CNI plugins (e.g. Calico, Cilium, Contiv, Flannel), in addition to SRIOV, DPDK, OVS-DPDK and VPP based workloads in Kubernetes.
  • NSX-T Container Plug-in (NCP) provides integration between VMware NSX-T and container orchestrators such as Kubernetes, as well as integration between NSX-T and container-based CaaS/PaaS platforms such as Pivotal Container Service (PKS) and OpenShift.
  • Nuage is an SDN platform that provides policy-based networking between Kubernetes Pods and non-Kubernetes environments with visibility and security monitoring.
  • Romana is a Layer 3 networking solution for Pod networks that also supports the NetworkPolicy API. Kubeadm add-on installation details available here.
  • Weave Net provides networking and network policy, will carry on working on both sides of a network partition, and does not require an external database.
Service Discovery
  • CoreDNS is a flexible, extensible DNS server which can be installed as the in-cluster DNS for pods.
Dashboards
  • Dashboard is a dashboard web interface for Kubernetes.
  • Weave Scope is a tool for graphically visualizing your containers, pods, services etc. Use it in conjunction with a Weave Cloud account or host the UI yourself.

Installing Minikube

  • Minikube is the simplest and fastest way to run a Kubernetes cluster. It builds a single-node cluster, useful for testing Kubernetes and for local development. Here it is installed on Debian; by default it drives a VirtualBox VM, though the kvm2 driver can be installed instead. Minikube's reference documentation covers the docker-machine parameters.
  • On first minikube start it creates a ~/.minikube directory and downloads minikube-vxxx.iso into ~/.minikube/cache/. If the download is very slow, fetch https://storage.googleapis.com/minikube/iso/minikube-v0.35.0.iso manually and copy it into that directory.
~$ curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.35.0/minikube-linux-amd64 && chmod +x minikube && sudo cp minikube /usr/local/bin/ && rm minikube

# Can also be downloaded from https://storage.googleapis.com/minikube/releases/v0.35.0/minikube-linux-amd64
# Download the kvm2 driver
~$ wget https://github.com/kubernetes/minikube/releases/download/v0.35.0/docker-machine-driver-kvm2
~$ chmod +x docker-machine-driver-kvm2 && sudo mv docker-machine-driver-kvm2 /usr/local/bin/
  • Start a Minikube VM. A bare minikube start defaults to VirtualBox. Networking uses default; the --kvm-network parameter takes its values from virsh net-list. minikube-net is an isolated network, which does not mean it cannot reach the Internet; for that, it must be bridged or NATed to a host NIC or network.
  • k8s.gcr.io is not directly reachable from mainland China, so minikube cannot pull its images and ends up unusable. There are a few workarounds; the simplest is a proxy:
    --docker-env HTTP_PROXY=<ip:port> --docker-env HTTPS_PROXY=<ip:port> --docker-env NO_PROXY=127.0.0.1,localhost
  • If the server can reach the Internet, add --insecure-registry k8s.gcr.io to OPTIONS in the docker daemon's startup parameters (/etc/sysconfig/docker).
~$  minikube start --vm-driver=kvm2 --kvm-network minikube-net --registry-mirror=https://registry.docker-cn.com --kubernetes-version v1.14.0

# Pulled from Alibaba Cloud; it is better to pull directly from k8s.gcr.io through a proxy if you can.
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.14.0
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.14.0
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.14.0
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.14.0
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.3.1
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.10
~$ docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1

# Re-tag them in bulk with a script; they could also be pushed to your company's private registry. In testing, the re-tagging approach did not quite work; a proxy is still needed to download normally.
~$ docker images | grep "aliyuncs.com" | awk '{split($1,a,"/"); print "docker tag " $1":"$2 " k8s.gcr.io/"a[3]":"$2}'
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.14.0 k8s.gcr.io/kube-proxy:v1.14.0
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.14.0 k8s.gcr.io/kube-apiserver:v1.14.0
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.14.0 k8s.gcr.io/kube-scheduler:v1.14.0
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.14.0 k8s.gcr.io/kube-controller-manager:v1.14.0
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.10 k8s.gcr.io/etcd:3.3.10
~$ docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1

# Then initialize the cluster manually with these images.
~$ sudo kubeadm init --kubernetes-version=v1.14.0
  • Testing shows kubeadm can also install by pulling directly from other domestic mirrors, and installing through a proxy works too, though finding a stable, reliable proxy is somewhat hard. minikube start --vm-driver=kvm2 creates a VM directly and runs sudo kubeadm config images pull --config /var/lib/kubeadm.yaml inside it to pull the docker images for the various k8s services; by default it pulls from https://k8s.gcr.io/v2/.

    ~$ minikube start --vm-driver=kvm2 --kvm-network minikube-net --registry-mirror=https://registry.docker-cn.com
  • It turns out the --registry-mirror parameter has no effect; the error below still appears:

     Unable to pull images, which may be OK: running cmd: sudo kubeadm config images pull --config /var/lib/kubeadm.yaml: command failed: sudo kubeadm config images pull --config /var/lib/kubeadm.yaml
    stdout:
    stderr: failed to pull image "k8s.gcr.io/kube-apiserver:v1.14.0": output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

  • Download kubectl to the local development (control) machine

    ~$ curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
    ~$ chmod +x kubectl && sudo mv kubectl /usr/local/bin

    # Enable shell completion
    ~$ echo "source <(kubectl completion bash)" >> ~/.bashrc
    ~$ kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    minikube Ready <none> 94m v1.13.4

Building Minikube from Source

  • Download the latest Go toolchain

    ~$ wget -c https://golang.google.cn/doc/install?download=go1.12.4.linux-amd64.tar.gz
    ~$ sudo tar xvf go1.12.4.linux-amd64.tar.gz -C /opt/
    ~$ export PATH=/opt/go/bin:$PATH
  • Fetch the source: first create /opt/go/src/k8s.io, clone into it, and rewrite the image registry addresses.

    ~$ sudo mkdir /opt/go/src/k8s.io && cd /opt/go/src/k8s.io &&   git clone https://github.com/kubernetes/minikube.git
    ~$ cd minikube && for item in `grep -l "k8s.gcr.io" -r *`;do sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" $item ;done
    ~$ make
    ~$ sudo cp out/minikube-linux-amd64 /usr/local/bin/minikube
  • In summary, since the docker tag approach still has problems, building Minikube from source is a clean way around the firewall; third-party Minikube binaries downloaded from the net risk carrying something unwanted.

Error Handling

⌛  Waiting for pods: apiserver proxy💣  Error restarting cluster: wait: waiting for k8s-app=kube-proxy: timed out waiting for the condition

😿 Sorry that minikube crashed. If this was unexpected, we would love to hear from you:
👉 https://github.com/kubernetes/minikube/issues/new

Installing Kubernetes Components by Hand

Installing kubeadm on Debian/Ubuntu distributions

  • kubeadm: the command to bootstrap the cluster.
  • kubelet: the component that runs on all of the machines in your cluster and does things like starting pods and containers.
  • kubectl: the command line util to talk to your cluster.
  • Pay attention to version compatibility between these three components when installing them.

The official package repository (not reachable from mainland China)

~$ apt-get update && apt-get install -y apt-transport-https curl dirmngr
~$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
# Or install the public key this way instead; it requires dirmngr
~$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 6A030B21BA07F4FB
~$ sudo bash -c "cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF"
# In testing, apt.kubernetes.io redirects to k8s.io, which is unreachable; the mirror http://mirrors.ustc.edu.cn/kubernetes/ can be used instead
~$ apt-get update
~$ apt-get install -y kubelet kubeadm kubectl
~$ apt-mark hold kubelet kubeadm kubectl

The Aliyun mirror repository

~$ apt-get update && apt-get install -y apt-transport-https
~$ curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
~$ sudo bash -c "cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF"
~$ sudo apt-get update && sudo apt-get install kubelet kubeadm kubectl -y
~$ sudo apt-mark hold kubelet kubeadm kubectl

# kubectl can also be downloaded directly with curl (a proxy is needed).
~$ curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl

  • Query which docker images the current version requires.
~$ kubeadm config images list --kubernetes-version v1.14.0
k8s.gcr.io/kube-apiserver:v1.14.0
k8s.gcr.io/kube-controller-manager:v1.14.0
k8s.gcr.io/kube-scheduler:v1.14.0
k8s.gcr.io/kube-proxy:v1.14.0
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.10
k8s.gcr.io/coredns:1.3.1
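Since the default registry is blocked, the listed images are usually pulled from a mirror and re-tagged. A minimal sketch of the name rewrite, reusing the same Aliyun namespace this document uses elsewhere:

```shell
# Rewrite each k8s.gcr.io image name to the Aliyun mirror namespace,
# mirroring the sed trick applied to the Minikube source tree above.
images='k8s.gcr.io/kube-apiserver:v1.14.0
k8s.gcr.io/pause:3.1'
echo "$images" | sed 's#^k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#'
```

You would then `docker pull` the rewritten names and `docker tag` them back to the k8s.gcr.io names that kubeadm expects.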

Installing the Master Node

Network Models

  • Kubernetes currently supports several network solutions, e.g. Flannel, Canal, Weave Net and Calico, all of which implement the CNI specification.
    Calico is installed below as the example. For the other installers see the "Installing a Pod network add-on" section of the official docs (also available in Chinese). A network must be chosen at kubeadm init time, otherwise other problems appear later. Per that guide:
    • Calico network model: kubeadm init --pod-network-cidr=192.168.0.0/16; it works only on amd64, arm64 and ppc64le.
    • Flannel network model: kubeadm init --pod-network-cidr=10.244.0.0/16, plus the kernel parameter
      sysctl net.bridge.bridge-nf-call-iptables=1; it works on Linux amd64, arm, arm64, ppc64le and s390x.
~$ kubectl apply -f \
https://docs.projectcalico.org/v3.6/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
[...]
serviceaccount/calico-kube-controllers created
  • Install the Master node. Inside mainland China --image-repository must be given, since the default k8s.gcr.io is not directly reachable, and --kubernetes-version must match the installed kubelet version.
~$ sudo kubeadm init --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers  --kubernetes-version v1.14.0 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[...]
# After a successful install, follow the steps printed below in order.
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a Pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.18.127.186:6443 --token z8r97j.3ovdfddb6df9lnq7 \
--discovery-token-ca-cert-hash sha256:07767a67fa6c38feda7471ee5e1a15a0a9c417cfdf6cf457ff577297f22d9415
  • Install the Flannel network plugin, matching the kubeadm init arguments
    ~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
    clusterrole.rbac.authorization.k8s.io/flannel created
    clusterrolebinding.rbac.authorization.k8s.io/flannel created
    serviceaccount/flannel created
    configmap/kube-flannel-cfg created
    daemonset.extensions/kube-flannel-ds-amd64 created
    daemonset.extensions/kube-flannel-ds-arm64 created
    daemonset.extensions/kube-flannel-ds-arm created
    daemonset.extensions/kube-flannel-ds-ppc64le created
    daemonset.extensions/kube-flannel-ds-s390x created
  • Following the success message above, configure access to the Master node; for installing the network plugin, see here.
~$ mkdir -p $HOME/.kube
~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

# List the bootstrap tokens
~$ kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
5smm64.9zpyhaqxghohh6b2 23h 2019-04-20T14:45:14+08:00 authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
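The --discovery-token-ca-cert-hash printed by kubeadm init is the SHA-256 digest of the cluster CA's DER-encoded public key. A sketch of recomputing it — a throwaway certificate is generated here for demonstration; on a real master you would point openssl at /etc/kubernetes/pki/ca.crt:

```shell
# Generate a throwaway CA certificate (stand-in for /etc/kubernetes/pki/ca.crt).
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-ca.key \
    -out /tmp/demo-ca.crt -days 1 -subj "/CN=demo-ca" 2>/dev/null
# sha256 of the DER-encoded public key -- the value kubeadm prints after "sha256:".
openssl x509 -in /tmp/demo-ca.crt -noout -pubkey \
    | openssl pkey -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256
```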

Modifying Kubelet Startup Parameters

  • The kubelet component is managed by systemd, so its unit configuration can be found under /etc/systemd/system on every node.
    ~# cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    # Note: This dropin only works with kubeadm and kubelet v1.11+
    [Service]
    Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
    Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
    # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
    EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
    # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
    # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
    EnvironmentFile=-/etc/default/kubelet
    ExecStart=
    ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
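Per the comments in the dropin, ad-hoc flags belong in /etc/default/kubelet via KUBELET_EXTRA_ARGS. A sketch — the --node-ip value is a hypothetical example, and a temp file stands in for /etc/default/kubelet here:

```shell
# Write the last-resort override file that the dropin sources (temp path for demonstration).
f=$(mktemp)
echo 'KUBELET_EXTRA_ARGS="--node-ip=172.18.127.186"' > "$f"
cat "$f"
# On a real node: write /etc/default/kubelet, then
#   systemctl daemon-reload && systemctl restart kubelet
```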

Installing Cluster Nodes

  • Install the same three components on another machine, then run the command below to join it to the k8s cluster.
    ~$ sudo kubeadm join 172.18.127.186:6443 --token z8r97j.3ovdfddb6df9lnq7     --discovery-token-ca-cert-hash sha256:07767a67fa6c38feda7471ee5e1a15a0a9c417cfdf6cf457ff577297f22d9415
    [preflight] Running pre-flight checks
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
    [preflight] Reading configuration from the cluster...
    [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
    [kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Activating the kubelet service
    [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

    This node has joined the cluster:
    * Certificate signing request was sent to apiserver and a response was received.
    * The Kubelet was informed of the new secure connection details.

    Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
  • Check the cluster nodes on the Master node.
~$ kubectl  get node
NAME STATUS ROLES AGE VERSION
aliyun-machine Ready master 110m v1.14.0
dig001 Ready <none> 84m v1.14.0
fe001 Ready <none> 2m8s v1.14.0
  • View the full cluster layout. The Master can also run workloads, i.e. the Master is itself a Node.
    ~$ kubectl get pod --all-namespaces -o wide
    NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
    kube-system coredns-d5947d4b-sr5zt 1/1 Running 0 6m3s 10.244.0.6 k8s-master <none> <none>
    kube-system coredns-d5947d4b-tznh2 1/1 Running 0 6m3s 10.244.0.5 k8s-master <none> <none>
    kube-system etcd-k8s-master 1/1 Running 0 5m11s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-apiserver-k8s-master 1/1 Running 0 5m15s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-controller-manager-k8s-master 1/1 Running 0 5m13s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-flannel-ds-amd64-9d965 1/1 Running 0 5m17s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-flannel-ds-amd64-c8dkh 1/1 Running 0 38s 172.18.192.76 dig001 <none> <none>
    kube-system kube-flannel-ds-amd64-kswj2 1/1 Running 0 52s 172.18.253.222 fe001 <none> <none>
    kube-system kube-proxy-5g9vp 1/1 Running 0 38s 172.18.192.76 dig001 <none> <none>
    kube-system kube-proxy-cqzfl 1/1 Running 0 52s 172.18.253.222 fe001 <none> <none>
    kube-system kube-proxy-pjbbg 1/1 Running 0 6m3s 172.18.127.186 k8s-master <none> <none>
    kube-system kube-scheduler-k8s-master 1/1 Running 0 5m19s 172.18.127.186 k8s-master <none> <none>
  • Get a Pod's full details
    ~$ kubectl get pod <podname> --output json   # show the Pod's full details as JSON
    ~$ kubectl get pod <podname> --output yaml # show the Pod's full details as YAML

Tearing Down the k8s Cluster

~$ kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
~$ kubectl delete node <node name>
~$ kubeadm reset

A HelloWorld Pod That Shares Data Between Containers

apiVersion: v1
kind: Pod
metadata:
  name: hello-world
spec:
  restartPolicy: Never
  containers:
  - name: write
    image: debian:latest
    volumeMounts:
    - name: data
      mountPath: /data
    command: ["bash","-c","echo \"Hello World\" >> /data/hello"]
  - name: read
    image: debian:latest
    volumeMounts:
    - name: data
      mountPath: /data
    command: ["bash","-c","sleep 10; cat /data/hello"]
  volumes:
  - name: data
    hostPath:
      path: /tmp
~$ kubectl apply -f hello-world.yaml
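What the two containers do can be checked locally without a cluster: both mount the same hostPath directory, so the writer's file is visible to the reader. A plain-shell emulation of the two commands (the sleep, shortened here, just gives the writer time to finish):

```shell
# Emulate the shared hostPath volume with a temp directory.
data=$(mktemp -d)
bash -c "echo \"Hello World\" >> $data/hello"   # the 'write' container's command
bash -c "sleep 1; cat $data/hello"              # the 'read' container's command
```

The second command prints the line the first one wrote, which is exactly what `kubectl logs hello-world -c read` should show for the Pod.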

Installing Kubernetes from GitHub

Etcd

~$ wget -c https://github.com/etcd-io/etcd/releases/download/v3.3.12/etcd-v3.3.12-linux-amd64.tar.gz
~$ tar zxvf etcd-v3.3.12-linux-amd64.tar.gz
~$ cd etcd-v3.3.12-linux-amd64 && sudo cp {etcd,etcdctl} /usr/local/bin
  • Run Etcd

    ~$ etcd -name etcd --data-dir /var/lib/etcd -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
    -advertise-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 >> /var/log/etcd.log 2>&1 &
  • Query its health status

    ~$ etcdctl -C http://127.0.0.1:4001 cluster-health
    member 8e9e05c52164694d is healthy: got healthy result from http://0.0.0.0:2379
    cluster is healthy

Installing from the Kubernetes Release Tarball

  • Kubernetes

  • Download the latest release from GitHub.

    ~$ wget -c https://github.com/kubernetes/kubernetes/releases/download/v1.14.0/kubernetes.tar.gz
    ~$ tar xvf kubernetes.tar.gz && cd kubernetes
    ~$ tree -L 1
    .
    ├── client
    ├── cluster
    ├── docs
    ├── hack
    ├── LICENSES
    ├── README.md
    ├── server
    └── version

    $ cat server/README
    Server binary tarballs are no longer included in the Kubernetes final tarball.

    Run cluster/get-kube-binaries.sh to download client and server binaries.
  • As the README above says, the server binaries are not included in this tarball; running cluster/get-kube-binaries.sh downloads them from https://dl.k8s.io/v1.14.0, but dl.k8s.io is not directly reachable from mainland China.

Installing MinIO (single-node service)

MinIO Server

Creating a PV (Persistent Volume)

  • In a Kubernetes environment you can use the MinIO Kubernetes Operator.
  • Below is a resource manifest that creates a 10G, local-type PV. A PV can be understood as a slice of some networked storage backend in the k8s cluster, quite similar to a Volume. A PV belongs to no particular Node but is accessible from every Node, and it is defined independently of any Pod. Currently supported PV types include:
    • GCEPersistentDisk,
    • AWSElasticBlockStore,
    • AzureFile, FC (Fibre Channel),
    • NFS,
    • iSCSI,
    • RBD (Rados Block Device),
    • CephFS,
    • GlusterFS,
    • HostPath (single-machine testing only), etc.
~$ cat pv.yaml

kind: PersistentVolume
apiVersion: v1
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data"

~$ kubectl create -f pv.yaml

Installing the MinIO PVC (Persistent Volume Claim)

  • A PVC requests a certain amount of storage, and k8s binds it to a PV that satisfies the request. If a PVC stays unbound after creation it ends up Pending, so run this experiment in the order PV -> PVC -> Deployment -> Service. The PV phases are:
    • Available: unused,
    • Bound: bound to some PVC,
    • Released: its PVC has been deleted, but the resource has not yet been reclaimed by the cluster,
    • Failed: automatic reclamation of the PV failed.
~$ kubectl create -f https://github.com/minio/minio/blob/master/docs/orchestration/kubernetes/minio-standalone-pvc.yaml?raw=true
persistentvolumeclaim/minio-pv-claim created
  • You can also download the file at that link to edit locally:

    ~$ cat minio-pvc.yaml
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      # This name uniquely identifies the PVC. This is used in deployment.
      name: minio-pv-claim
    spec:
      # Read more about access modes here: http://kubernetes.io/docs/user-guide/persistent-volumes/#access-modes
      storageClassName: standard
      accessModes:
        # The volume is mounted as read-write by a single node
        - ReadWriteOnce
      resources:
        # This is the request for storage. Should be available in the cluster.
        requests:
          storage: 10Gi
  • Check the PVC status in the system. It shows Pending below; use describe to see the details.

    ~$ kubectl get pvc  --namespace default
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    minio-pv-claim Pending
    ~$ kubectl get pvc --namespace default
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    minio-pv-claim Pending 2m22s
    lcy@k8s-master:~$ kubectl describe pvc minio-pv-claim
    Name: minio-pv-claim
    Namespace: default
    StorageClass:
    Status: Pending
    Volume:
    Labels: <none>
    Annotations: <none>
    Finalizers: [kubernetes.io/pvc-protection]
    Capacity:
    Access Modes:
    VolumeMode: Filesystem
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Normal FailedBinding 4s (x14 over 3m2s) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
    Mounted By: minio-6d4d48db87-wxr4d
  • The error shown in the Events above:

    FailedBinding  4s (x14 over 3m2s)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set
  • That is why it is Pending. Continue to the next step and fix these missing dependencies.

Installing the MinIO Deployment

~$ kubectl create -f https://github.com/minio/minio/blob/master/docs/orchestration/kubernetes/minio-standalone-deployment.yaml?raw=true
deployment.extensions/minio created

~$ cat minio-standalone-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  # This name uniquely identifies the Deployment
  name: minio
spec:
  strategy:
    # Specifies the strategy used to replace old Pods by new ones
    # Refer: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
    type: Recreate
  template:
    metadata:
      labels:
        # This label is used as a selector in Service definition
        app: minio
    spec:
      # Volumes used by this deployment
      volumes:
      - name: data
        # This volume is based on PVC
        persistentVolumeClaim:
          # Name of the PVC created earlier
          claimName: minio-pv-claim
      containers:
      - name: minio
        # Volume mounts for this container
        volumeMounts:
        # Volume 'data' is mounted to path '/data'
        - name: data
          mountPath: "/data"
        # Pulls the latest Minio image from Docker Hub
        image: minio/minio:RELEASE.2019-04-18T21-44-59Z
        args:
        - server
        - /data
        env:
        # MinIO access key and secret key
        - name: MINIO_ACCESS_KEY
          value: "minio"
        - name: MINIO_SECRET_KEY
          value: "minio123"
        ports:
        - containerPort: 9000
        # Readiness probe detects situations when MinIO server instance
        # is not ready to accept traffic. Kubernetes doesn't forward
        # traffic to the pod while readiness checks fail.
        readinessProbe:
          httpGet:
            path: /minio/health/ready
            port: 9000
          initialDelaySeconds: 120
          periodSeconds: 20
        # Liveness probe detects situations where MinIO server instance
        # is not working properly and needs restart. Kubernetes automatically
        # restarts the pods if liveness checks fail.
        livenessProbe:
          httpGet:
            path: /minio/health/live
            port: 9000
          initialDelaySeconds: 120
          periodSeconds: 20
~$ kubectl get deployment --namespace default
NAME READY UP-TO-DATE AVAILABLE AGE
minio 0/1 1 0 82s

Installing the MinIO Service

~$ kubectl create -f https://github.com/minio/minio/blob/master/docs/orchestration/kubernetes/minio-standalone-service.yaml?raw=true
service/minio-service created

~$ cat minio-standalone-service.yaml
apiVersion: v1
kind: Service
metadata:
  # This name uniquely identifies the service
  name: minio-service
spec:
  type: LoadBalancer
  ports:
    - port: 9000
      targetPort: 9000
      protocol: TCP
  selector:
    # Looks for labels `app:minio` in the namespace and applies the spec
    app: minio

~$ kubectl get svc minio-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
minio-service LoadBalancer 10.100.57.156 <pending> 9000:32552/TCP 10m
# Find out why it is Pending
~$ kubectl describe pod --namespace default -l app=minio
Name: minio-756cb7dff7-mcm6m
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: fe001/172.18.253.222
Start Time: Fri, 19 Apr 2019 15:20:30 +0800
Labels: app=minio
pod-template-hash=756cb7dff7
Annotations: <none>
Status: Running
IP: 10.244.1.4
Controlled By: ReplicaSet/minio-756cb7dff7
Containers:
minio:
Container ID: docker://119535fa5ab172b5b2155c650dc51c2d12b3c02b1e28ab9e8301eb318ab969a7
Image: minio/minio:RELEASE.2019-04-18T21-44-59Z
Image ID: docker-pullable://minio/minio@sha256:a26e089732b85f8c312ff6346498acec763033b1ac85e74fc897f667939ea2aa
Port: 9000/TCP
Host Port: 0/TCP
Args:
server
/data
State: Running
Started: Fri, 19 Apr 2019 15:20:51 +0800
Ready: True
Restart Count: 0
Liveness: http-get http://:9000/minio/health/live delay=120s timeout=1s period=20s #success=1 #failure=3
Readiness: http-get http://:9000/minio/health/ready delay=120s timeout=1s period=20s #success=1 #failure=3
Environment:
MINIO_ACCESS_KEY: minio
MINIO_SECRET_KEY: minio123
Mounts:
/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-2vsh9 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: minio-pv-claim
ReadOnly: false
default-token-2vsh9:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-2vsh9
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 5m10s (x7 over 12m) default-scheduler 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
Normal Scheduled 5m8s default-scheduler Successfully assigned default/minio-756cb7dff7-mcm6m to fe001
Normal Pulling 5m7s kubelet, fe001 Pulling image "minio/minio:RELEASE.2019-04-18T21-44-59Z"
Normal Pulled 4m47s kubelet, fe001 Successfully pulled image "minio/minio:RELEASE.2019-04-18T21-44-59Z"
Normal Created 4m47s kubelet, fe001 Created container minio
Normal Started 4m47s kubelet, fe001 Started container minio

  • As shown above, the Pending state was caused by: Warning FailedScheduling 5m10s (x7 over 12m) default-scheduler 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
~$ kubectl get pod --namespace default -l app=minio
NAME READY STATUS RESTARTS AGE
minio-756cb7dff7-k2sdk 1/1 Running 0 15m
# Execute a command inside a running container. The double dash below marks the end of kubectl's own options.
~$ kubectl exec minio-756cb7dff7-k2sdk -- ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=1 ttl=41 time=25.684 ms
# Run a shell inside the container
~$ kubectl exec -it minio-756cb7dff7-k2sdk sh
/ #
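The `--` convention is plain POSIX option parsing, not kubectl magic: everything after it is handed verbatim to the command in the container. A cluster-free sketch of the same parsing rule:

```shell
# Everything before -- is for the tool itself; everything after it is the remote command.
set -- exec minio-756cb7dff7-k2sdk -- ping -c 1 8.8.8.8
while [ "$1" != "--" ]; do shift; done
shift
echo "$@"    # the part that would be run inside the container
```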

Exposing Services via Ingress

  • There are two ways to expose a service to clients outside the cluster: LoadBalancer and Ingress. Every LoadBalancer service needs its own load balancer and a dedicated public IP address, whereas a single Ingress needs only one public IP to serve many services. When a client sends an HTTP request to the Ingress, the Ingress decides which service to forward it to based on the request's host name and path. An Ingress operates at the application (HTTP) layer of the network stack, so it can offer features that plain Services cannot, such as cookie-based session affinity.

The Traefik Reverse Proxy

Quick Install and Test

  • Get a Traefik configuration: copy a simple one from here, or download a Traefik binary from https://github.com/traefik/traefik/releases and run it like this:

    ~$ ./traefik -c traefik.toml
  • Or run it directly with Docker for the test, then open http://127.0.0.1:8080/dashboard/#/ in a browser to view the dashboard. Traefik has a few important core components:

    • Providers
    • Entrypoints
    • Routers
    • Services
    • Middlewares
      ~$ docker run -d -p 8080:8080 -p 80:80 -v $PWD/traefik.toml:/etc/traefik/traefik.toml traefik

A docker-compose walkthrough

  • The simple compose setup below is meant to give an intuitive picture of what Traefik is for and how it is used. Version v2.5 is used here, which differs from the old v1.2. The test directory layout:
~$ tree
.
├── minio
│   └── docker-compose.yml
├── traefik
│   ├── docker-compose-v2.yml
│   ├── docker-compose.yml
│   └── traefik.toml
└── whoami-app
├── docker-compose-v2.yml
└── docker-compose.yml

  • Run the Traefik instance

    traefik$ cat docker-compose-v2.yml
    version: '3'

    services:
      reverse-proxy:
        # The official v2 Traefik docker image
        image: traefik:v2.5
        # Enables the web UI and tells Traefik to listen to docker
        command: --api.insecure=true --providers.docker
        ports:
          # The HTTP port
          - "80:80"
          # The Web UI (enabled by --api.insecure=true)
          - "8080:8080"
        volumes:
          # So that Traefik can listen to the Docker events
          - /var/run/docker.sock:/var/run/docker.sock
    traefik$ docker-compose -f docker-compose-v2.yml up -d
    Creating network "traefik_default" with the default driver
    Creating traefik_reverse-proxy_1 ... done
  • Run 3 test instances (traefik/whoami)

whoami-app$ cat docker-compose-v2.yml
version: '3'
services:
  whoami:
    # A container that exposes an API to show its IP address
    image: traefik/whoami
    networks:
      - traefik_default
    labels:
      - traefik.docker.network=traefik_default
      - traefik.http.routers.whoami.rule=Host(`whoami.docker.localhost`)

networks:
  traefik_default:
    external: true

whoami-app$ docker-compose -f docker-compose-v2.yml up -d --scale whoami=3
Starting whoami-app_whoami_1 ... done
Creating whoami-app_whoami_2 ... done
Creating whoami-app_whoami_3 ... done
  • The service status can be viewed in the browser at http://127.0.0.1:8080/dashboard/#/http/services, or with the command below.

    ~$ curl  http://localhost:8080/api/rawdata | jq -c '.services'
    % Total % Received % Xferd Average Speed Time Time Time Current
    Dload Upload Total Spent Left Speed
    100 1829 100 1829 0 0 1786k 0 --:--:-- --:--:-- --:--:-- 1786k
    {"api@internal":{"status":"enabled","usedBy":["api@internal"]},"dashboard@internal":{"status":"enabled","usedBy":["dashboard@internal"]},"noop@internal":{"status":"enabled"},"reverse-proxy-traefik@docker":{"loadBalancer":{"servers":[{"url":"http://172.30.0.2:80"}],"passHostHeader":true},"status":"enabled","usedBy":["reverse-proxy-traefik@docker"],"serverStatus":{"http://172.30.0.2:80":"UP"}},"whoami-whoami-app@docker":{"loadBalancer":{"servers":[{"url":"http://172.31.0.4:80"},{"url":"http://172.31.0.2:80"},{"url":"http://172.31.0.3:80"}],"passHostHeader":true},"status":"enabled","usedBy":["whoami@docker"],"serverStatus":{"http://172.31.0.2:80":"UP","http://172.31.0.3:80":"UP","http://172.31.0.4:80":"UP"}}}
  • Or like this

    ~$ curl -H Host:whoami.docker.localhost http://127.0.0.1
    Hostname: 4c9f9a107136
    IP: 127.0.0.1
    IP: 172.19.0.6
    RemoteAddr: 172.19.0.1:46902
    GET / HTTP/1.1
    Host: whoami.docker.localhost
    User-Agent: curl/7.74.0
    Accept: */*
    Accept-Encoding: gzip
    X-Forwarded-For: 172.19.0.1
    X-Forwarded-Host: whoami.docker.localhost
    X-Forwarded-Port: 80
    X-Forwarded-Proto: http
    X-Forwarded-Server: 8a986b075043
    X-Real-Ip: 172.19.0.1

Enabling Certificates and Basic Auth

  • Below is a more elaborate docker-compose file that automatically obtains certificates from Let's Encrypt.

    version: "3.3"

    services:
      traefik:
        image: "traefik:v2.5"
        container_name: "traefik"
        command:
          - "--api.insecure=false"
          - "--providers.docker=true"
          - "--providers.docker.exposedbydefault=false"
          - "--providers.file.directory=/letsencrypt/"
          - "--entrypoints.web.address=:80"
          - "--entrypoints.websecure.address=:443"
          - "--certificatesresolvers.myresolver.acme.httpchallenge=true"
          - "--certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=web"
          #- "--certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
          - "--certificatesresolvers.myresolver.acme.email=<your email>@gmail.com"
          - "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json"
          # Global HTTP -> HTTPS
          - "--entrypoints.web.http.redirections.entryPoint.to=websecure"
          - "--entrypoints.web.http.redirections.entryPoint.scheme=https"
          # Enable dashboard
          - "--api.dashboard=true"
        ports:
          - "80:80"
          - "443:443"
          - "8080:8080"
        volumes:
          - "./certs:/letsencrypt"
          - "/var/run/docker.sock:/var/run/docker.sock:ro"
        labels:
          - "traefik.enable=true"
          - "traefik.http.services.api@internal.loadbalancer.server.port=8080" # required by swarm but not used.
          - "traefik.http.routers.traefik.rule=Host(`<Your FQDN domain name>`) && (PathPrefix(`/dashboard`) || PathPrefix(`/api`))"
          - "traefik.http.routers.traefik.middlewares=traefik-https-redirect"
          - "traefik.http.routers.traefik.entrypoints=websecure"
          - "traefik.http.routers.traefik.tls.certresolver=myresolver"
          - "traefik.http.routers.traefik.tls=true"
          - "traefik.http.routers.traefik.tls.options=default"
          - "traefik.http.middlewares.traefik-https-redirect.redirectscheme.scheme=https"
          - "traefik.http.routers.traefik.middlewares=traefik-auth"
          - "traefik.http.middlewares.traefik-auth.basicauth.users=<login name>:$$apr1$$Pf2MP/Oy$$...."
          - "traefik.http.routers.traefik.service=api@internal"
          #- 'traefik.http.routers.traefik.middlewares=strip'
          #- 'traefik.http.middlewares.strip.stripprefix.prefixes=/dashboard'

      whoami:
        image: "traefik/whoami"
        container_name: "simple-service"
        labels:
          - "traefik.enable=true"
          - "traefik.http.routers.whoami.rule=Host(`<Your FQDN domain name>`) && Path(`/whoami`)"
          - "traefik.http.routers.whoami.entrypoints=websecure"
          - "traefik.http.routers.whoami.tls.certresolver=myresolver"

  • The certificate above is obtained via httpchallenge (DNS validation also works), and the Traefik Dashboard is protected with TLS + Basic Auth. The whoami service is there as an example: with Host() && Path(`/whoami`) rules like this you can reverse-proxy many different internal services.

  • As shown above, the local mount ./certs:/letsencrypt is used, so the certs directory looks like below; acme.json holds the certificate material.

~$ tree ./certs/
./certs/
├── acme.json
└── tls.yml
  • tls.yml exists to harden the TLS settings; its content is below. You can score your server's TLS configuration at https://www.ssllabs.com; an A+ means it is in very good shape.

    ~$ cat certs/tls.yml
    tls:
      options:
        default:
          minVersion: "VersionTLS13"
          sniStrict: true
          cipherSuites:
            - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
            - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
            - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
            - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
            - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
            - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
  • Create the Basic Auth user and password. Usually the htpasswd command from apache2-utils is used, but it can also be done as below; every $ must be escaped as $$.

# -apr1  uses the apr1 algorithm (Apache variant of the BSD algorithm).

~$ openssl passwd -apr1
Password:
Verifying - Password:
$apr1$AzG7Y5HE$dZoKWVCmxffAe1oakeHR40

  • Or like this
~$ printf "<Your User>:$(openssl passwd -apr1 <Your password> | sed -E "s:[\$]:\$\$:g")\n"  >> ~/.htpasswd

~$ printf "admin:$(openssl passwd -apr1 admin | sed -E "s:[\$]:\$\$:g")\n"
admin:$$apr1$$7eSlrnJD$$XGLpWARS.YLxwYPoRtUdc.
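The $$ escaping can also be sanity-checked in isolation; this sketch round-trips the sample apr1 hash shown above:

```shell
# docker-compose treats $ as variable interpolation, so each $ in the hash must become $$.
hash='$apr1$AzG7Y5HE$dZoKWVCmxffAe1oakeHR40'
escaped=$(printf '%s' "$hash" | sed -E 's/\$/\$\$/g')
echo "$escaped"
```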

Installing MinIO with docker-compose

  • Run a slightly more elaborate MinIO service instance

    version: "3"

    services:
      minio:
        # Please use fixed versions :D
        image: minio/minio
        hostname: minio
        networks:
          - traefik_default
        volumes:
          - $PWD/minio-data:/data
        command:
          - server
          - /data
          - --console-address
          - ":9001"
        expose:
          - 9000
          - 9001
        environment:
          - MINIO_ROOT_USER=minio
          - MINIO_ROOT_PASSWORD=minio123
          - APP_NAME=minio
          # Do NOT use MINIO_DOMAIN or MINIO_SERVER_URL with Traefik.
          # All Routing is done by Traefik, just tell minio where to redirect to.
          - MINIO_BROWSER_REDIRECT_URL=http://minio-console.localhost
        labels:
          - traefik.enable=true
          - traefik.docker.network=traefik_default
          - traefik.http.routers.minio.rule=Host(`minio.localhost`)
          - traefik.http.routers.minio-console.rule=Host(`minio-console.localhost`)
          - traefik.http.routers.minio.service=minio
          - traefik.http.services.minio.loadbalancer.server.port=9000
          - traefik.http.services.minio-console.loadbalancer.server.port=9001
          - traefik.http.routers.minio-console.service=minio-console

    networks:
      traefik_default:
        external: true

    minio$ docker-compose up -d
    Creating minio_minio_1 ... done
  • After starting the instance above, open http://minio-console.localhost in a browser and log in to the MinIO console with minio:minio123. The reason a domain like minio-console.localhost reverse-proxies to the internal service is the routing: open http://localhost:8080/dashboard/#/http/routers to see the corresponding routes, or list them with the command below:

    ~$ curl http://localhost:8080/api/rawdata | jq   '.routers[] | .service'
    % Total % Received % Xferd Average Speed Time Time Time Current
    Dload Upload Total Spent Left Speed
    100 2509 0 2509 0 0 2450k 0 --:--:-- --:--:-- --:--:-- 2450k
    "api@internal"
    "dashboard@internal"
    "minio-console"
    "minio"
    "reverse-proxy-traefik"
    "whoami-whoami-app"
  • One thing to watch when testing: if the minio and traefik service definitions live in the same docker-compose.yml, no networks section is needed; if they are split across files, each service must declare that it joins traefik's network. In the tests above, docker-compose creates a network named traefik_default by default when it brings up the traefik service, which is why both the whoami and minio examples declare a networks field. The local Docker network list looks like this:

~$ docker network ls
NETWORK ID NAME DRIVER SCOPE
f8c2befaa42f bridge bridge local
f27beded9896 host host local
244ab53cbc48 none null local
18ddc4985478 traefik_default bridge local
1c5f5a863ef9 whoami-app_default bridge local
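
  • The network rule above boils down to one thing: a service deployed from a separate compose file must join Traefik's network explicitly. A minimal sketch (the service name and image here are placeholders, not from the original files):

    ```yaml
    # docker-compose.yml for a service deployed separately from traefik
    version: "3"
    services:
      app:
        image: traefik/whoami
        networks:
          - traefik_default   # join the network created by traefik's compose project

    networks:
      traefik_default:
        external: true        # do not create it; it already exists
    ```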
  • Traefik supports several ways to create routing rules, for example:

    • native Ingress resources
    • the CRD-based IngressRoute
    • the Gateway API
  • Stop the test instance

    whoami-app$ docker-compose -f docker-compose-v2.yml down
    Stopping whoami-app_whoami_1 ... done
    Stopping whoami-app_whoami_3 ... done
    Stopping whoami-app_whoami_2 ... done
    Removing whoami-app_whoami_1 ... done
    Removing whoami-app_whoami_3 ... done
    Removing whoami-app_whoami_2 ... done
    Removing network whoami-app_default

The mc client

  • MinIO Client Complete Guide Slack

    wget https://dl.min.io/client/mc/release/linux-amd64/mc
    chmod +x mc
    ./mc --help
  • Configure an S3 server endpoint

    ./mc config host add mystorage http://minio.localhost test1234access test1234secret --api s3v4
    mc: Configuration written to `/home/michael/.mc/config.json`. Please update your access credentials.
    mc: Successfully created `/home/michael/.mc/share`.
    mc: Initialized share uploads `/home/michael/.mc/share/uploads.json` file.
    mc: Initialized share downloads `/home/michael/.mc/share/downloads.json` file.
    Added `mystorage` successfully.

  • View server information

    ~$ ./mc admin info play/
    ● play.min.io
    Uptime: 2 days
    Version: 2021-12-10T23:03:39Z
    Network: 1/1 OK
    Drives: 4/4 OK

    11 GiB Used, 392 Buckets, 8,989 Objects
    4 drives online, 0 drives offline

Testing MinIO with an S3 client

~$ pip3 install awscli
minio$ aws configure --profile minio
AWS Access Key ID [None]: test1234minio
AWS Secret Access Key [None]: test1234minio
Default region name [None]: minio-lan
Default output format [None]: json
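  • aws configure --profile minio simply writes the profile into the standard AWS CLI config files, which can also be edited directly (paths shown are the defaults):

    ```ini
    ; ~/.aws/credentials
    [minio]
    aws_access_key_id = test1234minio
    aws_secret_access_key = test1234minio

    ; ~/.aws/config
    [profile minio]
    region = minio-lan
    output = json
    ```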

The Helm package manager

  • Helmk8s的包管理器,它可以类比为Debian,Ubuntuapt.Red Hatyum,Python中的pip.Nodejsnpm包管理器.Helm可以理解为Kubernetes的包管理工具,可以方便地发现、共享和使用为Kubernetes构建的应用,它包含几个基本概念:
    • .Chart:一个Helm包,其中包含了运行一个应用所需要的镜像、依赖和资源定义等,还可能包含Kubernetes集群中的服务定义,类似Homebrew中的formula,APTdpkg或者Yumrpm文件.
    • .Release:在Kubernetes集群上运行的Chart的一个实例.在同一个集群上,一个Chart可以安装很多次.每次安装都会创建一个新的Release. MySQL Chart,如果想在服务器上运行两个数据库,就可以把这个Chart安装两次.每次安装都会生成自己的Release,会有自己的Release名称.
    • .Repository:用于发布和存储Chart的仓库.
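  • For reference, a Chart is just a directory laid out by convention, roughly:

    ```
    mychart/
    ├── Chart.yaml       # chart name, version, description
    ├── values.yaml      # default configuration values
    ├── charts/          # dependency charts
    └── templates/       # Kubernetes manifests as Go templates
        ├── deployment.yaml
        └── service.yaml
    ```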
# Download the latest release and unpack it into /usr/local/bin
~$ wget -c https://storage.googleapis.com/kubernetes-helm/helm-v2.13.1-linux-amd64.tar.gz

~$ helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Error: could not find tiller

~$ helm completion bash > ~/.helmrc
echo "source ~/.helmrc" >> ~/.bashrc

Installing the Tiller server

  • Using Helm to simplify Kubernetes application deployment
    ~$ helm init
    Creating /home/lcy/.helm
    Creating /home/lcy/.helm/repository
    Creating /home/lcy/.helm/repository/cache
    Creating /home/lcy/.helm/repository/local
    Creating /home/lcy/.helm/plugins
    Creating /home/lcy/.helm/starters
    Creating /home/lcy/.helm/cache/archive
    Creating /home/lcy/.helm/repository/repositories.yaml
    Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
    Error: Looks like "https://kubernetes-charts.storage.googleapis.com" is not a valid chart repository or cannot be reached: read tcp 172.18.127.186:54980->216.58.199.16:443: read: connection reset by peer

  • As shown above, the official server cannot be reached; from mainland China, install using the Aliyun mirror instead:
    ~$ helm init --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.13.1 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
    Creating /home/lcy/.helm/repository/repositories.yaml
    Adding stable repo with URL: https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
    Adding local repo with URL: http://127.0.0.1:8879/charts
    $HELM_HOME has been configured at /home/lcy/.helm.

    Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

    Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
    To prevent this, run `helm init` with the --tiller-tls-verify flag.
    For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
    Happy Helming!

    ~$ helm init --upgrade
    $HELM_HOME has been configured at /home/lcy/.helm.

    Tiller (the Helm server-side component) has been upgraded to the current version.
    Happy Helming!
# Search for charts
    ~$ helm search
    NAME CHART VERSION APP VERSION DESCRIPTION
    stable/acs-engine-autoscaler 2.1.3 2.1.1 Scales worker nodes within agent pools
    stable/aerospike 0.1.7 v3.14.1.2 A Helm chart for Aerospike in Kubernetes
    stable/anchore-engine 0.1.3 0.1.6 Anchore container analysis and policy evaluation engine s...
    stable/artifactory 7.0.3 5.8.4 Universal Repository Manager supporting all major packagi...
    stable/artifactory-ha 0.1.0 5.8.4 Universal Repository Manager supporting all major packagi...
    [...]
# Update the repositories
    ~$ helm repo update
    Hang tight while we grab the latest from your chart repositories...
    ...Skip local chart repository
    ...Successfully got an update from the "stable" chart repository
    Update Complete. ⎈ Happy Helming!⎈

# Repository list
    ~$ helm repo list
    NAME URL
    stable https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
    local http://127.0.0.1:8879/charts

Deploying MinIO with a Helm Chart

  • MinIO manual (Chinese edition)
  • In the default standalone mode, the chart requires the beta APIs to be enabled (Kubernetes 1.4+). If nothing goes wrong the deployment succeeds as shown below; otherwise, see the error handling section later.
    The default accessKey is AKIAIOSFODNN7EXAMPLE.
    The default secretKey is wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY.
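  • Rather than keeping the example credentials, they can be overridden at install time; a sketch assuming the chart exposes accessKey/secretKey values (check helm inspect values stable/minio for the actual names):

    ```yaml
    # custom-values.yaml, passed via: helm install stable/minio -f custom-values.yaml
    accessKey: "myownaccesskey"
    secretKey: "myownsecretkey12345"
    ```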
~$ helm install stable/minio
NAME: snug-elk
LAST DEPLOYED: Mon Apr 15 14:07:10 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
snug-elk-minio-config-cm 2 0s

==> v1/PersistentVolumeClaim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
snug-elk-minio Pending 0s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
snug-elk-minio-7b9878bb66-mmx9n 0/1 Pending 0 0s

==> v1/Secret
NAME TYPE DATA AGE
snug-elk-minio-user Opaque 2 0s

==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
snug-elk-minio-svc LoadBalancer 10.103.48.186 <pending> 9000:32076/TCP 0s

==> v1beta1/Deployment
NAME READY UP-TO-DATE AVAILABLE AGE
snug-elk-minio 0/1 1 0 0s


NOTES:

Minio can be accessed via port 9000 on an external IP address. Get the service external IP address by:
kubectl get svc --namespace default -l app=snug-elk-minio

Note that the public IP may take a couple of minutes to be available.

You can now access Minio server on http://<External-IP>:9000. Follow the below steps to connect to Minio server with mc client:

1. Download the Minio mc client - https://docs.minio.io/docs/minio-client-quickstart-guide

2. mc config host add snug-elk-minio-local http://<External-IP>:9000 AKIAIOSFODNN7EXAMPLE wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY S3v4

3. mc ls snug-elk-minio-local

Alternately, you can use your browser or the Minio SDK to access the server - https://docs.minio.io/categories/17
  • Inspect the Release objects
~$ kubectl get service snug-elk-minio-svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
snug-elk-minio-svc LoadBalancer 10.103.48.186 <pending> 9000:32076/TCP 3m7s

~$ kubectl get pods
NAME READY STATUS RESTARTS AGE
store-minio-75bb89c596-74nz9 0/1 Pending 0 17m

~$ kubectl describe pod store-minio-75bb89c596-74nz9
Name: store-minio-75bb89c596-74nz9
[...]

~$ helm list
NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
snug-elk 1 Mon Apr 15 14:07:10 2019 DEPLOYED minio-0.5.5 default

Installing the MinIO client

  • Following the NOTES printed by the server install above, install and configure its command-line client.
    ~$ wget https://dl.minio.io/client/mc/release/linux-amd64/mc
    ~$ sudo mv mc /usr/local/bin && chmod +x /usr/local/bin/mc
    ~$ kubectl get svc --namespace default -l app=snug-elk-minio
    NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
    snug-elk-minio-svc LoadBalancer 10.103.48.186 <pending> 9000:32076/TCP 25h

Installing the cluster monitoring Dashboard

~$ kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
  • Check whether the installation succeeded.
~$ kubectl  --namespace=kube-system get pod -l k8s-app=kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-5f7b999d65-m2zmt 0/1 ImagePullBackOff 0 10m

~$ kubectl --namespace=kube-system describe pod -l k8s-app=kubernetes-dashboard | grep "Events" -A +10
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned kube-system/kubernetes-dashboard-5f7b999d65-m2zmt to dig001
Normal Pulling 11m (x4 over 14m) kubelet, dig001 Pulling image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1"
Warning Failed 11m (x4 over 13m) kubelet, dig001 Failed to pull image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1": rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 11m (x4 over 13m) kubelet, dig001 Error: ErrImagePull
Warning Failed 11m (x6 over 13m) kubelet, dig001 Error: ImagePullBackOff
Normal BackOff 4m5s (x36 over 13m) kubelet, dig001 Back-off pulling image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1"

  • Delete the failed pod with kubectl -n kube-system delete $(kubectl -n kube-system get pod -o name | grep dashboard), then change the image address below to the domestic Aliyun mirror.
  • Open the deployment with kubectl edit deployment/kubernetes-dashboard -n kube-system, replace k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 with registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1, then save and exit.
# After the change, the installation succeeds.
~$ kubectl --namespace=kube-system get deployment -l k8s-app=kubernetes-dashboard
NAME READY UP-TO-DATE AVAILABLE AGE
kubernetes-dashboard 1/1 1 1 51m
~$ kubectl --namespace=kube-system get pod -l k8s-app=kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-5d9599dc98-gj8w7 1/1 Running 0 99s
~$ kubectl --namespace=kube-system get svc -l k8s-app=kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard ClusterIP 10.107.85.75 <none> 443/TCP 49m

ProxySSH转发访问方式

# On the k8s server
~$ kubectl proxy
Starting to serve on 127.0.0.1:8001
# On the workstation, run the command below, then open http://127.0.0.1:8001/ in a browser to reach the Dashboard
~$ ssh -L 8001:localhost:8001 <user@k8s-server> -Nf

Changing the Service type

  • Open the service with kubectl --namespace=kube-system edit svc kubernetes-dashboard and change type: ClusterIP to type: NodePort.
    ~$ kubectl --namespace=kube-system edit svc kubernetes-dashboard
    ~$ kubectl --namespace=kube-system get svc kubernetes-dashboard
    NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
    kubernetes-dashboard NodePort 10.107.85.75 <none> 443:31690/TCP 65m
  • As shown above, the Dashboard can now be reached at https://<service-host>:31690/.

Access authentication

Creating a service account

~$ cat admin-user.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
~$ kubectl create -f admin-user.yaml

~$ kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}')
Name: admin-user-token-7dj2c
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name: admin-user
kubernetes.io/service-account.uid: a65538aa-673b-11e9-b8a2-00163e027e39

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1025 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLTdkajJjIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhNjU1MzhhYS02NzNiLTExZTktYjhhMi0wMDE2M2UwMjdlMzkiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.eW_YSgn_kTIQDcbB51k8HaY9LABeFg5mFFPJykYgsoyxZH_b80WEcDZn4Z4Ix2BJvhK1sBESfSa_Qn1yN5pcIzUMROIYvBGZBSnMmw2VsSpQMUTJ1ha43ql-GKCz15ro1VrhyeWeCtiVTILA0Z0DwfgO2skjY2x1KO_76sDR7r66frZDjGmgYTm-b3E6RdcETB41Wjjuj-nt3b3ZblkBr3QDKP-tlvnW_nr7LcmgF7etjU8qK_W3fj-LB_BnWRpRiamQeXLNJuC-Dq42x00gAzQuVg17rDcEiKxJWmDYYsojvm7Xg0fSwXLCdBfgysYCz5PMR05dT0QU0iYO7z_Cow
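
  • The token is a standard JWT, so its payload can be inspected locally; a sketch using a fabricated token of the same header.payload.signature shape (real service-account tokens are RS256-signed by the cluster):

    ```shell
    # Build a fake JWT-shaped token; the claims string is purely illustrative.
    claims='{"sub":"system:serviceaccount:kube-system:admin-user"}'
    payload=$(printf '%s' "$claims" | base64 | tr -d '=\n' | tr '/+' '_-')
    token="header.${payload}.signature"

    # Decode the payload: take the 2nd dot-separated field, restore base64
    # padding, map the URL-safe alphabet back, then base64-decode.
    p=$(printf '%s' "$token" | cut -d. -f2)
    case $(( ${#p} % 4 )) in
      2) p="${p}==" ;;
      3) p="${p}="  ;;
    esac
    printf '%s' "$p" | tr '_-' '/+' | base64 -d
    ```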
  • Open the Dashboard login page and paste in the token value above.

Creating a cluster role binding

~$ cat dashboard-admin.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

~$ kubectl create -f dashboard-admin.yaml

~$ kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep "kubernetes-dashboard-token" |awk '{print $1}')
Name: kubernetes-dashboard-token-ftk96
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name: kubernetes-dashboard
kubernetes.io/service-account.uid: ba35db4d-672d-11e9-b8a2-00163e027e39

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1025 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi1mdGs5NiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImJhMzVkYjRkLTY3MmQtMTFlOS1iOGEyLTAwMTYzZTAyN2UzOSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.gnspFBWcou3-EgMsSPPQqSt1fwWne6tLCNgHa0yrQLEH9DDRgDQh1mBDne4Z2M-s3FPlTV9DI77QneanA12jHrpLHRohQSlyiz8Pv3xa7JRb7Hfyj5PhbSlX2KtTbOlVvAdlFttFi3vw-fbUJWcALEmogwa7jnlR233slJLjZ8nAA9xsE-gr4_zYmZ2VhYGfH0dAs2H2aCklRl2Sy5VQpoDlGjKH82-FcCrLwGQyLpAA9tr0H7pivGIFqO46PWR0aBLiT1BBkmjoQJkDPy0qRxi90nG1WyFnDLHYK6BRDTZ4G-J3QhAiAK0su-7i6rJhMKm-FbnYXULIstW1LyO4tg

Updating the Dashboard

  • Installing the Dashboard does not automatically replace an existing older version; the old version must be removed by hand before the new one is installed.
    ~$ kubectl -n kube-system delete $(kubectl -n kube-system get pod -o name | grep dashboard)

Official quick-start examples

The Guestbook example

~$ git clone https://github.com/kubernetes/examples
~$ cd examples
~$ tree -L 1
.
├── cassandra
├── code-of-conduct.md
├── CONTRIBUTING.md
├── guestbook
├── guestbook-go
├── guidelines.md
├── LICENSE
├── mysql-wordpress-pd
├── OWNERS
├── README.md
├── SECURITY_CONTACTS
└── staging

5 directories, 7 files

Installing the Redis Master Pod

~$ cd examples/guestbook && ls *.yaml
frontend-deployment.yaml redis-master-deployment.yaml redis-slave-deployment.yaml
frontend-service.yaml redis-master-service.yaml redis-slave-service.yaml
~$ cat redis-master-deployment.yaml
apiVersion: apps/v1 # for k8s versions before 1.9.0 use apps/v1beta2 and before 1.8.0 use extensions/v1beta1
kind: Deployment
metadata:
  name: redis-master
spec:
  selector:
    matchLabels:
      app: redis
      role: master
      tier: backend
  replicas: 1
  template:
    metadata:
      labels:
        app: redis
        role: master
        tier: backend
    spec:
      containers:
      - name: master
        # image: k8s.gcr.io/redis:e2e # or just image: redis
        image: forestgun007/redis:e2e
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 6379
  • Note that k8s.gcr.io is not reachable from mainland China, so the image in redis-master-deployment.yaml has to be changed; docker search redis:e2e turns up several mirrored copies of the image.
~$ docker search redis:e2e
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
forestgun007/redis gcr.io/google_containers/redis:e2e 1 [OK]
will835559313/gcr_redis gcr.io/google_containers/redis:e2e 0 [OK]
smallguitar/redis-master gcr.io/google_containers/redis:e2e 0 [OK]

~$ kubectl apply -f redis-master-deployment.yaml
~$ kubectl get pods
redis-slave-555b8847c4-mttt9 1/1 Running 0 16h

Installing the Redis Master Service

~$ cat redis-master-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-master
  labels:
    app: redis
    role: master
    tier: backend
spec:
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    app: redis
    role: master
    tier: backend
~$ kubectl apply -f redis-master-service.yaml
~$ kubectl get service redis-master
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-master ClusterIP 10.106.23.86 <none> 6379/TCP 17h

Installing the Redis Slave Pods

$ cat redis-slave-deployment.yaml
apiVersion: apps/v1 # for k8s versions before 1.9.0 use apps/v1beta2 and before 1.8.0 use extensions/v1beta1
kind: Deployment
metadata:
  name: redis-slave
spec:
  selector:
    matchLabels:
      app: redis
      role: slave
      tier: backend
  replicas: 2
  template:
    metadata:
      labels:
        app: redis
        role: slave
        tier: backend
    spec:
      containers:
      - name: slave
        # gcr.io/google_samples/gb-redisslave:v1 must likewise be replaced with this mirror
        image: forestgun007/gb-redisslave:v1
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
          # If your cluster config does not include a dns service, then to
          # instead access an environment variable to find the master
          # service's host, comment out the 'value: dns' line above, and
          # uncomment the line below:
          # value: env
        ports:
        - containerPort: 6379

~$ kubectl apply -f redis-slave-deployment.yaml
~$ kubectl get pods
NAME READY STATUS RESTARTS AGE
redis-slave-555b8847c4-mttt9 1/1 Running 0 17h
redis-slave-555b8847c4-r24xx 1/1 Running 0 17h

Installing the Redis Slave Service

~$ kubectl apply -f  redis-slave-service.yaml

~$ kubectl get svc redis-slave
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-slave ClusterIP 10.103.39.53 <none> 6379/TCP 17h

Installing the Frontend Pods

~$ cat frontend-deployment.yaml
apiVersion: apps/v1 # for k8s versions before 1.9.0 use apps/v1beta2 and before 1.8.0 use extensions/v1beta1
kind: Deployment
metadata:
  name: frontend
spec:
  selector:
    matchLabels:
      app: guestbook
      tier: frontend
  replicas: 3
  template:
    metadata:
      labels:
        app: guestbook
        tier: frontend
    spec:
      containers:
      - name: php-redis
        # image: gcr.io/google-samples/gb-frontend:v4
        image: forestgun007/google-samples-gb-frontend:v4
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
          # If your cluster config does not include a dns service, then to
          # instead access environment variables to find service host
          # info, comment out the 'value: dns' line above, and uncomment the
          # line below:
          # value: env
        ports:
        - containerPort: 80

~$ kubectl apply -f frontend-deployment.yaml

~$ kubectl get pod
NAME READY STATUS RESTARTS AGE
frontend-6f4cc58c94-2wv5l 1/1 Running 0 17h
frontend-6f4cc58c94-s6s8l 1/1 Running 0 17h
frontend-6f4cc58c94-z9qmk 1/1 Running 0 17h

Installing the Frontend Service

~$ kubectl apply -f  frontend-service.yaml
~$ kubectl get service frontend
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
frontend NodePort 10.106.20.24 <none> 80:30577/TCP 17h

Troubleshooting

Certificate error when connecting

~$ kubectl get node
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
  • This error usually occurs after a kubeadm reset when ~/.kube/config has not been refreshed; cp /etc/kubernetes/admin.conf ~/.kube/config fixes it.

Errors joining a node

~$ sudo kubeadm join 172.18.127.186:6443 --token z8r97j.3ovdfddb6df9lnq7     --discovery-token-ca-cert-hash sha256:07767a67fa6c38feda7471ee5e1a15a0a9c417cfdf6cf457ff577297f22d9415
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: couldn't validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s
  • If joining a node takes a very long time and then fails, add -v=6 and errors like the following appear:
    ~$ sudo kubeadm join 172.18.127.186:6443 --token z8r97j.3ovdfddb6df9lnq7     --discovery-token-ca-cert-hash sha256:07767a67fa6c38feda7471ee5e1a15a0a9c417cfdf6cf457ff577297f22d9415 -v=6

    [...]
    I0414 16:23:00.398178 13202 token.go:200] [discovery] Trying to connect to API Server "172.18.127.186:6443"
    I0414 16:23:00.398724 13202 token.go:75] [discovery] Created cluster-info discovery client, requesting info from "https://172.18.127.186:6443"
    I0414 16:23:00.402234 13202 round_trippers.go:438] GET https://172.18.127.186:6443/api/v1/namespaces/kube-public/configmaps/cluster-info 200 OK in 3 milliseconds
    I0414 16:23:00.402426 13202 token.go:203] [discovery] Failed to connect to API Server "172.18.127.186:6443": token id "z8r97j" is invalid for this cluster or it has expired. Use "kubeadm token create" on the control-plane node to create a new valid token
    [...]
  • When that happens, run the command below on the Master node and re-run the kubeadm join line it prints on the node you want to add.
    ~$ kubeadm  token create --print-join-command
# Re-join using the output below.
    kubeadm join 172.18.127.186:6443 --token in1l6v.ue78pr5vvr55qcad --discovery-token-ca-cert-hash sha256:a1f80db7a76e214dd529fc2aed660d71428994d9104c1b320bf5abb6cda4b165
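
  • The --discovery-token-ca-cert-hash is just the SHA-256 of the cluster CA's DER-encoded public key; a sketch that reproduces the computation against a throwaway self-signed CA standing in for /etc/kubernetes/pki/ca.crt (paths are local to this demo):

    ```shell
    # Generate a throwaway CA certificate for demonstration purposes.
    openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/ca.key \
        -out /tmp/ca.crt -subj "/CN=kubernetes" -days 1 2>/dev/null
    # Extract the public key, convert to DER, and hash it -- the same
    # computation kubeadm performs for the join hash.
    hash=$(openssl x509 -pubkey -noout -in /tmp/ca.crt \
           | openssl pkey -pubin -outform der 2>/dev/null \
           | openssl dgst -sha256 -hex | awk '{print $NF}')
    echo "sha256:${hash}"
    ```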

Errors installing charts

~$ helm install stable/minio
Error: could not find a ready tiller pod
  • First, update the repositories
    ~$ helm init --upgrade
    $HELM_HOME has been configured at /home/lcy/.helm.

    Tiller (the Helm server-side component) has been upgraded to the current version.
    Happy Helming!

    ~$ helm repo list
    NAME URL
    stable https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
    local http://127.0.0.1:8879/charts
  • Check the status of the k8s system pods, looking at the tiller-deploy entry.
    ~$ kubectl  -n kube-system get po
    NAME READY STATUS RESTARTS AGE
    calico-kube-controllers-5cbcccc885-krbzj 1/1 Running 0 17h
[...]
    kube-scheduler-k8s-master 1/1 Running 0 18h
    tiller-deploy-c48485567-m7kj2 0/1 ErrImagePull 0 2m50s
  • The output above shows that tiller-deploy failed to pull its image and never started; inspect the details:
~$ kubectl  describe pod tiller-deploy-c48485567-m7kj2   -n kube-system
Name: tiller-deploy-c48485567-m7kj2
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 16m default-scheduler Successfully assigned kube-system/tiller-deploy-c48485567-m7kj2 to dig001
Normal Pulling 14m (x4 over 16m) kubelet, dig001 Pulling image "gcr.io/kubernetes-helm/tiller:v2.13.1"
Warning Failed 13m (x4 over 16m) kubelet, dig001 Failed to pull image "gcr.io/kubernetes-helm/tiller:v2.13.1": rpc error: code = Unknown desc = Error response from daemon: Get https://gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 13m (x4 over 16m) kubelet, dig001 Error: ErrImagePull
Warning Failed 13m (x7 over 16m) kubelet, dig001 Error: ImagePullBackOff
Normal BackOff 82s (x57 over 16m) kubelet, dig001 Back-off pulling image "gcr.io/kubernetes-helm/tiller:v2.13.1"

  • The pull fails because the image lives on gcr.io; pick a mirror with docker search tiller | grep "Mirror" and then edit the deployment with the command below.
    ~$ kubectl edit deploy tiller-deploy -n kube-system
    [....]
  • Replace image: gcr.io/kubernetes-helm/tiller:v2.13.1 above with image: sapcc/tiller:v2.13.1; running again then shows tiller up and running.
~$ kubectl get pod -n kube-system | grep "tiller"
tiller-deploy-b7bd9495c-bf777 1/1 Running 0 2m57s
~$ helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}

Error: no available release name found

~$ kubectl create serviceaccount --namespace kube-system tiller
~$ kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
clusterrolebinding.rbac.authorization.k8s.io/tiller-cluster-rule created
~$ kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
deployment.extensions/tiller-deploy patched

kubeadm init errors

~$ sudo kubeadm init --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers  --kubernetes-version v1.14.1 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.14.1
[preflight] Running pre-flight checks
[preflight] WARNING: Couldn't create the interface used for talking to the container runtime: docker is required for container runtime: exec: "docker": executable file not found in $PATH
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
  • ERROR NumCPU: add the flag --ignore-preflight-errors=NumCPU and it is downgraded to a warning.

  • The FileContent--proc-sys-net-bridge-bridge-nf-call-iptables errors are handled as follows:

    ~$ apt-get install bridge-utils
# a reboot may be required
    ~$ modprobe bridge
    ~$ modprobe br_netfilter

    ~$ cat <<EOF > /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward=1
    EOF

    ~$ sysctl --system
  • kube-proxy and iptables issues

    ~$ kubectl -n kube-system logs  kube-proxy-xxx
    W0514 00:21:27.445425 1 server_others.go:267] Flag proxy-mode="" unknown, assuming iptables proxy
cat mysql-pass.yaml

apiVersion: v1
kind: Secret
metadata:
  name: mysql-pass
type: Opaque
data:
  username: cm9vdA==
  password: cGFzczEyMw==
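
  • The data values in a Secret are only base64-encoded, not encrypted; the two strings above can be reproduced (and audited) from the shell:

    ```shell
    # base64-encode the credentials; printf avoids a trailing newline creeping in
    user=$(printf '%s' root | base64)       # cm9vdA==
    pass=$(printf '%s' pass123 | base64)    # cGFzczEyMw==
    echo "$user $pass"
    # decoding works the same way in reverse
    printf '%s' "$user" | base64 -d         # root
    ```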


Development board overview

Pin definitions

  • The data below is taken from DE0_Nano_User_Manual.pdf, Chapter 3, Section 3.5, Expansion Headers.

GPIO-0 Pin Assignments

Signal Name FPGA Pin No. Description I/O Standard
GPIO_0_IN0 PIN_A8 GPIO Connection DATA 3.3V
GPIO_00 PIN_D3 GPIO Connection DATA 3.3V
GPIO_0_IN1 PIN_B8 GPIO Connection DATA 3.3V
GPIO_01 PIN_C3 GPIO Connection DATA 3.3V
GPIO_02 PIN_A2 GPIO Connection DATA 3.3V
GPIO_03 PIN_A3 GPIO Connection DATA 3.3V
GPIO_04 PIN_B3 GPIO Connection DATA 3.3V
GPIO_05 PIN_B4 GPIO Connection DATA 3.3V
GPIO_06 PIN_A4 GPIO Connection DATA 3.3V
GPIO_07 PIN_B5 GPIO Connection DATA 3.3V
GPIO_08 PIN_A5 GPIO Connection DATA 3.3V
GPIO_09 PIN_D5 GPIO Connection DATA 3.3V
GPIO_010 PIN_B6 GPIO Connection DATA 3.3V
GPIO_011 PIN_A6 GPIO Connection DATA 3.3V
GPIO_012 PIN_B7 GPIO Connection DATA 3.3V
GPIO_013 PIN_D6 GPIO Connection DATA 3.3V
GPIO_014 PIN_A7 GPIO Connection DATA 3.3V
GPIO_015 PIN_C6 GPIO Connection DATA 3.3V
GPIO_016 PIN_C8 GPIO Connection DATA 3.3V
GPIO_017 PIN_E6 GPIO Connection DATA 3.3V
GPIO_018 PIN_E7 GPIO Connection DATA 3.3V
GPIO_019 PIN_D8 GPIO Connection DATA 3.3V
GPIO_020 PIN_E8 GPIO Connection DATA 3.3V
GPIO_021 PIN_F8 GPIO Connection DATA 3.3V
GPIO_022 PIN_F9 GPIO Connection DATA 3.3V
GPIO_023 PIN_E9 GPIO Connection DATA 3.3V
GPIO_024 PIN_C9 GPIO Connection DATA 3.3V
GPIO_025 PIN_D9 GPIO Connection DATA 3.3V
GPIO_026 PIN_E11 GPIO Connection DATA 3.3V
GPIO_027 PIN_E10 GPIO Connection DATA 3.3V
GPIO_028 PIN_C11 GPIO Connection DATA 3.3V
GPIO_029 PIN_B11 GPIO Connection DATA 3.3V
GPIO_030 PIN_A12 GPIO Connection DATA 3.3V
GPIO_031 PIN_D11 GPIO Connection DATA 3.3V
GPIO_032 PIN_D12 GPIO Connection DATA 3.3V
GPIO_033 PIN_B12 GPIO Connection DATA 3.3V
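
In a Quartus project these assignments correspond to set_location_assignment lines in the project's .qsf file; a sketch for the first few pins (the signal names must match whatever the top-level ports are called in your design):

```tcl
set_location_assignment PIN_A8 -to GPIO_0_IN[0]
set_location_assignment PIN_B8 -to GPIO_0_IN[1]
set_location_assignment PIN_D3 -to GPIO_0[0]
set_location_assignment PIN_C3 -to GPIO_0[1]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to GPIO_0[0]
```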

GPIO-1 Pin Assignments

Signal Name FPGA Pin No. Description I/O Standard
GPIO_1_IN0 PIN_T9 GPIO Connection DATA 3.3V
GPIO_10 PIN_F13 GPIO Connection DATA 3.3V
GPIO_1_IN1 PIN_R9 GPIO Connection DATA 3.3V
GPIO_11 PIN_T15 GPIO Connection DATA 3.3V
GPIO_12 PIN_T14 GPIO Connection DATA 3.3V
GPIO_13 PIN_T13 GPIO Connection DATA 3.3V
GPIO_14 PIN_R13 GPIO Connection DATA 3.3V
GPIO_15 PIN_T12 GPIO Connection DATA 3.3V
GPIO_16 PIN_R12 GPIO Connection DATA 3.3V
GPIO_17 PIN_T11 GPIO Connection DATA 3.3V
GPIO_18 PIN_T10 GPIO Connection DATA 3.3V
GPIO_19 PIN_R11 GPIO Connection DATA 3.3V
GPIO_110 PIN_P11 GPIO Connection DATA 3.3V
GPIO_111 PIN_R10 GPIO Connection DATA 3.3V
GPIO_112 PIN_N12 GPIO Connection DATA 3.3V
GPIO_113 PIN_P9 GPIO Connection DATA 3.3V
GPIO_114 PIN_N9 GPIO Connection DATA 3.3V
GPIO_115 PIN_N11 GPIO Connection DATA 3.3V
GPIO_116 PIN_L16 GPIO Connection DATA 3.3V
GPIO_117 PIN_K16 GPIO Connection DATA 3.3V
GPIO_118 PIN_R16 GPIO Connection DATA 3.3V
GPIO_119 PIN_L15 GPIO Connection DATA 3.3V
GPIO_120 PIN_P15 GPIO Connection DATA 3.3V
GPIO_121 PIN_P16 GPIO Connection DATA 3.3V
GPIO_122 PIN_R14 GPIO Connection DATA 3.3V
GPIO_123 PIN_N16 GPIO Connection DATA 3.3V
GPIO_124 PIN_N15 GPIO Connection DATA 3.3V
GPIO_125 PIN_P14 GPIO Connection DATA 3.3V
GPIO_126 PIN_L14 GPIO Connection DATA 3.3V
GPIO_127 PIN_N14 GPIO Connection DATA 3.3V
GPIO_128 PIN_M10 GPIO Connection DATA 3.3V
GPIO_129 PIN_L13 GPIO Connection DATA 3.3V
GPIO_130 PIN_J16 GPIO Connection DATA 3.3V
GPIO_131 PIN_K15 GPIO Connection DATA 3.3V
GPIO_132 PIN_J13 GPIO Connection DATA 3.3V
GPIO_133 PIN_J14 GPIO Connection DATA 3.3V

Table 3-8 Pin Assignments for 2x13 Header

Signal Name FPGA Pin No. Description I/O Standard
GPIO_2[0] PIN_A14 GPIO Connection DATA[0] 3.3V
GPIO_2[1] PIN_B16 GPIO Connection DATA[1] 3.3V
GPIO_2[2] PIN_C14 GPIO Connection DATA[2] 3.3V
GPIO_2[3] PIN_C16 GPIO Connection DATA[3] 3.3V
GPIO_2[4] PIN_C15 GPIO Connection DATA[4] 3.3V
GPIO_2[5] PIN_D16 GPIO Connection DATA[5] 3.3V
GPIO_2[6] PIN_D15 GPIO Connection DATA[6] 3.3V
GPIO_2[7] PIN_D14 GPIO Connection DATA[7] 3.3V
GPIO_2[8] PIN_F15 GPIO Connection DATA[8] 3.3V
GPIO_2[9] PIN_F16 GPIO Connection DATA[9] 3.3V
GPIO_2[10] PIN_F14 GPIO Connection DATA[10] 3.3V
GPIO_2[11] PIN_G16 GPIO Connection DATA[11] 3.3V
GPIO_2[12] PIN_G15 GPIO Connection DATA[12] 3.3V
GPIO_2_IN[0] PIN_E15 GPIO Input 3.3V
GPIO_2_IN[1] PIN_E16 GPIO Input 3.3V
GPIO_2_IN[2] PIN_M16 GPIO Input 3.3V

Table 3-9 Pin Assignments for ADC

Signal Name FPGA Pin No. Description I/O Standard
ADC_CS_N PIN_A10 Chip select 3.3V
ADC_SADDR PIN_B10 Digital data input 3.3V
ADC_SDAT PIN_A9 Digital data output 3.3V
ADC_SCLK PIN_B14 Digital clock input 3.3V

JTAG

de0-nano-jtag-block.png

OpenRISC

The GCC toolchain

~$ git clone https://github.com/stffrdhrn/or1k-toolchain-build
~$ cd or1k-toolchain-build
~$ docker build -t or1k-toolchain-build or1k-toolchain-build/
  • Set the mount-directory variables and run the container to build. If a download URL inside build-gcc.sh has gone stale, replace it with a working alternative, e.g. QEMU_URL=https://github.com/vamanea/qemu-or32/archive/v2.0.2.tar.gz.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
# The location where you have tarballs, so they dont need to be
# downloaded every time you build
CACHEDIR=/home/user/work/docker/volumes/src
# The location where you want your output to go
OUTPUTDIR=/home/user/work/docker/volumes/crosstool

docker run -it --rm \
-e MUSL_ENABLED=1 \
-e NEWLIB_ENABLED=1 \
-e NOLIB_ENABLED=1 \
-e GCC_VERSION=9.0.1 \
-e BINUTILS_VERSION=2.32.51 \
-e LINUX_HEADERS_VERSION=4.19.1 \
-e MUSL_VERSION=1.1.20 \
-e GMP_VERSION=6.1.2 \
-v ${OUTPUTDIR}:/opt/crosstool:Z \
-v ${CACHEDIR}:/opt/crossbuild/cache:Z \
or1k-toolchain-build
  • A successful build leaves the following artifacts:
ls
or1k-elf-9.0.1-20210204.tar.xz
or1k-elf-gcc-9.0.1-20210204.log.gz
or1k-elf-gcc-9.0.1-20210204.sum
or1k-elf-gxx-9.0.1-20210204.log.gz
or1k-elf-gxx-9.0.1-20210204.sum
or1k-linux-9.0.1-20210204.tar.xz
or1k-linux-musl-9.0.1-20210204.tar.xz
or1k-linux-musl-gcc-9.0.1-20210203.log.gz
or1k-linux-musl-gcc-9.0.1-20210203.sum
or1k-linux-musl-gcc-9.0.1-20210204.log.gz
or1k-linux-musl-gcc-9.0.1-20210204.sum
or1k-linux-musl-gxx-9.0.1-20210203.log.gz
or1k-linux-musl-gxx-9.0.1-20210203.sum
or1k-linux-musl-gxx-9.0.1-20210204.log.gz
or1k-linux-musl-gxx-9.0.1-20210204.sum
relnotes-9.0.1-20210204.md

  • Unpack or1k-linux-9.0.1-20210204.tar.xz into a local directory and set the environment variables accordingly:
    export ALTERA_PATH="/home/michael/3TB-DISK/intelFPGA_lite/20.1/"
    export PATH=$PATH:$ALTERA_PATH/quartus/bin

    export ARCH=openrisc
    export CROSS_COMPILE=or1k-linux-
    export PATH=$PATH:`pwd`/toolchain-rootfs/or1k-linux/bin
  • Or build or1k-gcc on its own:
    ~$ git clone https://github.com/openrisc/or1k-gcc
    ~$ cd or1k-gcc/
    ~$ mkdir build-linux
    ~$ cd build-linux && ../configure && make -j4
    ~$ make install DESTDIR=<absolute path>

Building ORPSOC

~$ git clone https://github.com/mczerski/orpsoc-de0_nano
~$ export ALTERA_PATH="/home/fullpath/QuartusIIWebEdition13.0.0.156/quartus"
~$ export PATH=$PATH:$ALTERA_PATH/bin
~$ make OR32_TOOL_PREFIX=or1k-linux- all

Building Linux

~$ tar xvf linux-4.16.14.tar.xz
~$ cd linux-4.16.14
~$ wget -c https://kevinmehall.net/openrisc/guide/de0_nano.dts.txt -O arch/openrisc/boot/dts/de0_nano.dts

~$ make ARCH=openrisc CROSS_COMPILE="or1k-linux-" or1ksim_defconfig
/** Select Processor type and Features -> Builtin DTB and type de0_nano */
~$ make ARCH=openrisc CROSS_COMPILE="or1k-linux-" menuconfig
~$ make ARCH=openrisc CROSS_COMPILE="or1k-linux-"

Programming the bitstream

~$ cd orpsoc/boards/altera/de0_nano/syn/quartus/run
~$ make OR32_TOOL_PREFIX=or1k-linux- all
~$ make pgm

Connecting the serial port

  • Using the GPIO pin definitions above and the information below, connect the correct RX and TX lines.
cat boards/altera/de0_nano/syn/quartus/tcl/UART0_pin_assignments.tcl
set_location_assignment PIN_D8 -to uart0_srx_pad_i
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to uart0_srx_pad_i
set_location_assignment PIN_F8 -to uart0_stx_pad_o
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to uart0_stx_pad_o

Trying FuseSoC

Installing FuseSoC


~$ git clone https://github.com/olofk/fusesoc

~$ cd fusesoc && pip install -e .

OR
~$ pip install fusesoc

Installing FuseSoC libraries

  • Install from the network:
~$ fusesoc library add intgen https://github.com/openrisc/intgen.git
  • Install from a local checkout:
~$ git clone https://github.com/openrisc/mor1kx-generic.git
~$ git clone https://github.com/openrisc/or1k_marocchino.git

~$ fusesoc library add mor1kx-generic `pwd`/mor1kx-generic
~$ fusesoc library add or1k_marocchino `pwd`/or1k_marocchino
~$ fusesoc list-cores
Available cores:

Core Cache status Description
================================================================================
::blinky:0 : local : <No description>
::intgen:0 : local : Interrupt Generator For testing Processors
::mor1kx-generic:1.1 : local : Minimal mor1kx simulation environment
::or1k_marocchino:5.0-r3 : local : <No description>
::plights:0 : local : <No description>
::rv_sopc:0 : local : RISC V system on programmable chip example
::wb_intercon_gen_ng:0 : local : CAPI=2 .core file description based Wishbone Interconnect generator
  • Inspect the fusesoc.conf configuration:
~$ cat fusesoc.conf
[library.mor1kx-generic]
location = /fullpath/FPGA-DE0-Nano/openrisc/mor1kx-generic
sync-uri = /fullpath/FPGA-DE0-Nano/openrisc/mor1kx-generic
sync-type = local
auto-sync = true

[library.or1k_marocchino]
location = /fullpath/FPGA-DE0-Nano/openrisc/or1k_marocchino
sync-uri = /fullpath/FPGA-DE0-Nano/openrisc/or1k_marocchino
sync-type = local
auto-sync = true

[library.intgen]
location = /fullpath/FPGA-DE0-Nano/openrisc/intgen
sync-uri = /fullpath/FPGA-DE0-Nano/openrisc/intgen
sync-type = local
auto-sync = true

[library.fusesoc-demos]
location = fusesoc_libraries/fusesoc-demos
sync-uri = https://github.com/Oxore/fusesoc-demos
sync-type = git
auto-sync = true
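
fusesoc.conf is plain INI syntax, so the registered libraries can be listed programmatically. A minimal Python sketch using the standard library's configparser (it reads an inline excerpt shaped like the file above, rather than the real file):

```python
import configparser

# Minimal fusesoc.conf excerpt in the same shape as the one shown above.
CONF = """
[library.mor1kx-generic]
location = /fullpath/FPGA-DE0-Nano/openrisc/mor1kx-generic
sync-type = local

[library.fusesoc-demos]
location = fusesoc_libraries/fusesoc-demos
sync-uri = https://github.com/Oxore/fusesoc-demos
sync-type = git
"""

parser = configparser.ConfigParser()
parser.read_string(CONF)
for section in parser.sections():
    if section.startswith("library."):
        name = section.split(".", 1)[1]
        # Print each registered library and how it syncs (local vs git).
        print(name, parser[section]["sync-type"])
```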

Fusesoc-demos

fusesoc library add  https://github.com/Oxore/fusesoc-demos

RISC-V

Installing Quartus 20

Test-building litex-boards

  • Here we test with a litex-hub project; Quartus is installed under ~/intelFPGA_lite/20.1/quartus. Note: when passing the --load option, the jtagd service must already be running, otherwise quartus_pgm cannot program over JTAG.
~$ export PATH=~/riscv64-toolchain/bin:~/intelFPGA_lite/20.1/quartus/bin:$PATH
~$ jtagd --foreground --debug
~$ cd litex-boards/litex_boards
~$ targets/terasic_de0nano.py --uart-name=jtag_uart --build --load
  • The build ultimately runs the following script:
linux-on-litex-vexriscv$ cat build/de0nano/gateware/build_de0nano.sh
# Autogenerated by LiteX / git: 55a79030
quartus_map --read_settings_files=on --write_settings_files=off de0nano -c de0nano
quartus_fit --read_settings_files=off --write_settings_files=off de0nano -c de0nano
quartus_asm --read_settings_files=off --write_settings_files=off de0nano -c de0nano
quartus_sta de0nano -c de0nano
if [ -f "de0nano.sof" ]
then
quartus_cpf -c de0nano.sof de0nano.rbf
fi

Building and testing linux-on-litex-vexriscv

~$ cd linux-on-litex-vexriscv
~$ export PATH=~/riscv64-toolchain/bin:~/intelFPGA_lite/20.1/quartus/bin:$PATH
~$ ./make.py --board=de0nano --build --load
  • When running with --load, make sure jtagd is running; as shown below, the bitstream is ultimately loaded with quartus_pgm -m jtag -c USB-Blaster.
    [...]
    Info: Command: quartus_pgm -m jtag -c USB-Blaster -o p;/home/michael/workspace-xilinx/RISC-V/litex-hub/litex/linux-on-litex-vexriscv/build/de0nano/gateware/de0nano.sof@1
    Info (213046): Using programming cable "USB-Blaster on 127.0.0.1 [3-3]"
    Info (213011): Using programming file /home/michael/workspace-xilinx/RISC-V/litex-hub/litex/linux-on-litex-vexriscv/build/de0nano/gateware/de0nano.sof with checksum 0x0085AEAF for device EP4CE22F17@1
    Info (209060): Started Programmer operation at Sat Feb 26 11:35:48 2022
    Info (209016): Configuring device index 1
    Info (209017): Device 1 contains JTAG ID code 0x020F30DD
    Info (209007): Configuration succeeded -- 1 device(s) configured
    Info (209011): Successfully performed operation(s)
    Info (209061): Ended Programmer operation at Sat Feb 26 11:35:49 2022
    Info: Quartus Prime Programmer was successful. 0 errors, 0 warnings
    Info: Peak virtual memory: 315 megabytes
    Info: Processing ended: Sat Feb 26 11:35:49 2022
    Info: Elapsed time: 00:00:32
    Info: Total CPU time (on all processors): 00:00:00

Viewing boot messages over JTAG-UART

  • The script defaults to serial for communication; above I built with --uart-name=jtag_uart, wanting to try UART communication over the board's USB JTAG port.
~$ ./targets/terasic_de0nano.py --help
[...]
--no-uart
Disable UART. (default: False)
--uart-name UART_NAME
UART type/name. (default: serial)
--uart-baudrate UART_BAUDRATE
UART baudrate. (default: 115200)
--uart-fifo-depth UART_FIFO_DEPTH
UART FIFO depth. (default: 16)
[...]
  • Create a new openocd configuration file with settings for the DE0-Nano's EP4CE22F17, as follows:
litex_boards$ cat prog/openocd_de0nano.cfg
adapter driver usb_blaster
usb_blaster lowlevel_driver ftdi
set _CHIPNAME EP4CE22F17
set FPGA_TAPID 0x020F30DD
adapter speed 6000
jtag newtap $_CHIPNAME tap -irlen 10 -expected-id $FPGA_TAPID
#scan_chain
gdb_port disabled
tcl_port disabled

  • Connect as follows:
 ~/.local/bin/litex_term jtag --jtag-config=./prog/openocd_de0nano.cfg
port is 20000
got ir value 2
Open On-Chip Debugger 0.11.0+dev-00562-g5ab74bde0-dirty (2022-02-07-19:44)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : only one transport option; autoselect 'jtag'
jtagstream_serve
Info : usb blaster interface using libftdi
Info : This adapter doesn't support configurable speed
Info : JTAG tap: EP4CE22F17.tap tap/device found: 0x020f30dd (mfg: 0x06e (Altera), part: 0x20f3, ver: 0x0)
Warn : gdb services need one or more targets defined

  • Connecting to the jtag-uart simply hangs and no serial console appears. litex_term's jtag option wraps the openocd command line, equivalent to:
~$ openocd -f ./prog/openocd_de0nano.cfg -f stream.cfg -c <....>
  • After --build completes, a stream.cfg file is generated in the current directory; it is an openocd configuration written in TCL, auto-generated by litex/litex/build/openocd.py, excerpted below:
litex$ tail -n 20 litex/litex/build/openocd.py
}

proc jtagstream_serve {tap port} {
set sock [socket stream.server $port]
$sock readable [list jtagstream_client $tap $sock]
stdin readable [list jtagstream_exit $sock]
vwait forever
$sock close
}
"""
write_to_file("stream.cfg", cfg)
print("port is {:d}".format(port))
print("got ir value {:d}".format(self.get_ir(chain,config)))
script = "; ".join([
"init",
"irscan $_CHIPNAME.tap {:d}".format(self.get_ir(chain, config)),
"jtagstream_serve $_CHIPNAME.tap {:d}".format(port),
"exit",
])
self.call(["openocd", "-f", config, "-f", "stream.cfg", "-c", script])
  • Connecting to jtag_uart through the board's USB port did not work.

Viewing boot messages over the serial port

  • Since the jtag_uart approach failed, fall back to the default serial connection. The board has no USB-to-UART function, and the documentation does not explain how to reach its UART, but searching the sources turns up the following definition:
litex-boards$ cat litex_boards/platforms/terasic_de0nano.py
[...]
# Switches
("sw", 0, Pins("M1"), IOStandard("3.3-V LVTTL")),
("sw", 1, Pins("T8"), IOStandard("3.3-V LVTTL")),
("sw", 2, Pins("B9"), IOStandard("3.3-V LVTTL")),
("sw", 3, Pins("M15"), IOStandard("3.3-V LVTTL")),

# Serial
("serial", 0,
# Compatible with cheap FT232 based cables (ex: Gaoominy 6Pin Ftdi Ft232Rl Ft232)
# GND on JP1 Pin 12.
Subsignal("tx", Pins("JP1:10"), IOStandard("3.3-V LVTTL")),
Subsignal("rx", Pins("JP1:8"), IOStandard("3.3-V LVTTL"))
),

# SDR SDRAM
("sdram_clock", 0, Pins("R4"), IOStandard("3.3-V LVTTL")),
[...]
  • Here an FTDI 2232H is wired up as follows:
DE0-nano JP1                FT2232H
GPIO_05 Pin8 <-------> AD0 TXD
GPIO_07 Pin10 <-------> AD1 RXD
GND Pin12 <-------> GND
  • At this point litex_term or minicom can talk to the board's serial port. Garbled output means a baudrate mismatch: the default here is actually 1 Mbps (1e6). Many cheap USB-to-UART adapters bought online garble output or drop input at 1 Mbps; switching to an FTDI 2232H made it work normally.
litex-boards $ minicom -o -b 1000000 -D /dev/ttyUSB0

__ _ __ _ __
/ / (_) /____ | |/_/
/ /__/ / __/ -_)> <
/____/_/\__/\__/_/|_|
Build your hardware, easily!

(c) Copyright 2012-2022 Enjoy-Digital
(c) Copyright 2007-2015 M-Labs

BIOS CRC passed (1f65f3e6)

Migen git sha1: ac70301
LiteX git sha1: 7cc781f7

--=============== SoC ==================--
CPU: VexRiscv SMP-LINUX @ 50MHz
BUS: WISHBONE 32-bit @ 4GiB
CSR: 32-bit data
ROM: 64KiB
SRAM: 8KiB
L2: 2KiB
SDRAM: 32768KiB 16-bit @ 50MT/s (CL-2 CWL-2)

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
Write: 0x40000000-0x40200000 2.0MiB
Read: 0x40000000-0x40200000 2.0MiB
Memtest OK
Memspeed at 0x40000000 (Sequential, 2.0MiB)...
Write speed: 15.8MiB/s
Read speed: 11.7MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Timeout
No boot medium found

--============= Console ================--

litex>

  • Use help to list the supported commands:
litex> help
LiteX BIOS, available commands:

help - Print this help
ident - Identifier of the system
crc - Compute CRC32 of a part of the address space
flush_cpu_dcache - Flush CPU data cache
flush_l2_cache - Flush L2 cache
leds - Set Leds value

boot - Boot from Memory
reboot - Reboot
serialboot - Boot from Serial (SFL)

mem_list - List available memory regions
mem_read - Read address space
mem_write - Write address space
mem_copy - Copy address space
mem_test - Test memory access
mem_speed - Test memory speed
mem_cmp - Compare memory content

sdram_test - Test SDRAM


litex> ident

Ident: LiteX SoC on DE0-Nano

  • The leds command switches the board's led0~led7 on and off:
litex> leds 255  # all LEDs on

Settings Leds to 0xff
litex> leds 1 # led0 on

Settings Leds to 0x1

litex> leds 11 # led0, led1, led3 on: (1 << 0) + (1 << 1) + (1 << 3)

Settings Leds to 0xb
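
The argument to leds is just a bitmask, with bit n driving led n. A small Python sketch of the arithmetic (leds_mask is a hypothetical helper, not part of the LiteX BIOS):

```python
def leds_mask(*leds):
    """Build the value for the BIOS `leds` command from LED indices."""
    mask = 0
    for n in leds:
        mask |= 1 << n          # set bit n to light led n
    return mask

print(hex(leds_mask(*range(8))))  # all eight LEDs -> 0xff
print(hex(leds_mask(0)))          # led0 only      -> 0x1
print(hex(leds_mask(0, 1, 3)))    # led0,1,3       -> 0xb
```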

Adding an SPI-SDCard peripheral

  • The DE0-Nano has no SD card slot, so wire an SPI-SDCard socket to the JP1 header and let the board boot Linux from the card. This requires source changes: first, add spisdcard to the soc_capabilities parameter of the De0Nano class in linux-on-litex-vexriscv/make.py, as follows:
litex-hub/linux-on-litex-vexriscv$ cat make.py
[...]
class De0Nano(Board):
soc_kwargs = {"l2_size" : 2048} # Use Wishbone and L2 for memory accesses.
def __init__(self):
from litex_boards.targets import de0nano
Board.__init__(self, de0nano.BaseSoC, soc_capabilities={
# Communication
"serial",
"spisdcard"
}, bitstream_ext=".sof")
[...]
  • Then, in the litex-hub/litex-boards project, apply the following patch:
litex-hub/litex-boards$ git diff litex_boards/platforms/terasic_de0nano.py
diff --git a/litex_boards/platforms/terasic_de0nano.py b/litex_boards/platforms/terasic_de0nano.py
index 3284ddc..7a810f7 100644
--- a/litex_boards/platforms/terasic_de0nano.py
+++ b/litex_boards/platforms/terasic_de0nano.py
@@ -115,6 +115,14 @@ _io = [
"F15 F16 F14 G16 G15"),
IOStandard("3.3-V LVTTL")
),
+ # SDCard
+ ("spisdcard", 0,
+ Subsignal("clk", Pins("JP1:18")),
+ Subsignal("cs_n", Pins("JP1:20")),
+ Subsignal("mosi", Pins("JP1:14"), Misc("WEAK_PULL_UP_RESISTOR ON")),
+ Subsignal("miso", Pins("JP1:16"), Misc("WEAK_PULL_UP_RESISTOR ON")),
+ IOStandard("3.3-V LVTTL")
+ ),
]

# Connectors ---------------------------------------------------------------------------------------
  • Build linux-on-litex-vexriscv:
    linux-on-litex-vexriscv$ ./make.py --board=de0nano --build
  • When that finishes, take an SD card, use fdisk to set the partition type to W95 FAT32, and format it with mkfs.fat. Then copy the files from linux-on-litex-vexriscv/images into the card's root directory. The SPI-SDCard module wires to JP1 as follows:
SPI-Card Module              de0-nano JP1  FPGA Pin No
CS <----------> JP1:20 GPIO_015 C6
CLK <----------> JP1:18 GPIO_013 D6
SDO <----------> JP1:16 GPIO_011 A6
SDI <----------> JP1:14 GPIO_09 D5
GND <----------> GND
3V3 <----------> 3V3
  • The wiring above follows the code in litex_boards/platforms/terasic_de0nano.py together with the GPIO-0 Pin Assignments described earlier.
linux-on-litex-vexriscv$ ./make.py --board=de0nano --load
  • Connect the serial port and boot from the SDCard:

    linux-on-litex-vexriscv$ minicom -o -b 1000000 -D /dev/ttyUSB0

    (c) Copyright 2012-2022 Enjoy-Digital
    (c) Copyright 2007-2015 M-Labs

    BIOS CRC passed (1038d38c)

    Migen git sha1: ac70301
    LiteX git sha1: 7f49c523

    --=============== SoC ==================--
    CPU: VexRiscv SMP-LINUX @ 50MHz
    BUS: WISHBONE 32-bit @ 4GiB
    CSR: 32-bit data
    ROM: 64KiB
    SRAM: 8KiB
    L2: 2KiB
    SDRAM: 32768KiB 16-bit @ 50MT/s (CL-2 CWL-2)

    --========== Initialization ============--
    Initializing SDRAM @0x40000000...
    Switching SDRAM to software control.
    Switching SDRAM to hardware control.
    Memtest at 0x40000000 (2.0MiB)...
    Write: 0x40000000-0x40200000 2.0MiB
    Read: 0x40000000-0x40200000 2.0MiB
    Memtest OK
    Memspeed at 0x40000000 (Sequential, 2.0MiB)...
    Write speed: 16.7MiB/s
    Read speed: 20.3MiB/s

    --============== Boot ==================--
    Booting from serial...
    Press Q or ESC to abort boot completely.
    sL5DdSMmkekro
    Timeout
    Booting from SDCard in SPI-Mode...
    Booting from boot.json...
    Copying Image to 0x40000000 (7531468 bytes)...
    [########################################]
    Copying rv32.dtb to 0x40ef0000 (2621 bytes)...
    [########################################]
    Copying rootfs.cpio to 0x41000000 (3786240 bytes)...
    [########################################]
    Copying opensbi.bin to 0x40f00000 (53640 bytes)...
    [########################################]
    Executing booted program at 0x40f00000
    [...]
  • If booting from the SDCard fails, first make sure the card's partition type is W95 FAT32, then try another card. Here an old 512MB card and an old 1GB card were both undetectable, while a 32GB and a 128GB card both loaded and ran fine. It seems enjoy-digital/litesdcard has compatibility problems with old cards, or there is some other unknown cause.

The quartus_cpf command

  • View the help for its options, e.g.:
~$ quartus_cpf --help=rpd

Topic: rpd

To generate a Raw Programming Data File (.rpd), specify the input
file name and output file name. Make sure the file extension
of the output file is .rpd. The input file can be only a
Programmer Object File (.pof).

---------
Examples:
---------

# To convert .pof to .rpd
quartus_cpf -c <input_pof_file> <output_rpd_file>

# To use a Conversion Setup File (.cof) created with
# the Convert Programming Files dialog box in the UI
quartus_cpf -c <input_cof_file>

  • Inspect a .sof file:
~$ quartus_cpf --info de0nano.sof
File: de0nano.sof
File CRC: 0x24F3
Creator: Quartus Prime Compiler Version 20.1.1 Build 720 11/11/2020 SJ Lite Edition
Comment: Untitled
Device: EP4CE22F17
Data checksum: 0x008595BA
JTAG usercode: 0x008595BA
Project Hash: 0x

  • Generate an .svf file:
~$ quartus_cpf -c -q 6.0MHz -g 3.3 -n p de0nano.sof de0nano.svf
  • Generate an .rpd file:
~$ quartus_cpf -c -d EPCS64 de0nano.sof de0nano.pof
~$ quartus_cpf -c -d EPCS64 -s EP4CE22F17 de0nano.pof de0nano.rpd
  • Generate a .jic file, which can be programmed via Quartus Prime IDE -> Tools -> Programmer -> Add File...; make sure the jtagd service is running.
    ~$ quartus_cpf -c -d EPCS64 -s EP4CE22F17 de0nano.sof de0nano.jic

Loading an SVF file with openocd

  • Create an openocd connection configuration matching the board's parameters:

    ~$ cat > openocd_de0nano.cfg <<EOF
    adapter driver usb_blaster
    usb_blaster lowlevel_driver ftdi
    set CHIPNAME EP4CE22F17
    set FPGA_TAPID 0x020F30DD # obtained via jtagconfig
    adapter speed 6000
    jtag newtap $CHIPNAME tap -irlen 10 -expected-id $FPGA_TAPID
    init
    scan_chain

    EOF
  • Program it into the FPGA:

~$ openocd -f ./openocd_de0nano.cfg -c "svf  de0nano.svf progress" -c exit
Open On-Chip Debugger 0.11.0+dev-00562-g5ab74bde0-dirty (2022-02-07-19:44)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : only one transport option; autoselect 'jtag'
Info : usb blaster interface using libftdi
Info : This adapter doesn't support configurable speed
Info : JTAG tap: EP4CE22F17.tap tap/device found: 0x020f30dd (mfg: 0x06e (Altera), part: 0x20f3, ver: 0x0)
Warn : gdb services need one or more targets defined
TapName Enabled IdCode Expected IrLen IrCap IrMask
-- ------------------- -------- ---------- ---------- ----- ----- ------
0 EP4CE22F17.tap Y 0x020f30dd 0x020f30dd 10 0x01 0x03

svf processing file: "de0nano.svf"
0% FREQUENCY 1.20E+07 HZ;
Error: Translation from khz to adapter speed not implemented

0% TRST ABSENT;
0% ENDDR IDLE;
0% ENDIR IRPAUSE;
0% STATE IDLE;
0% SIR 10 TDI (002);
0% RUNTEST IDLE 12000 TCK ENDSTATE IDLE;
95% FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF);
95% SIR 10 TDI (004);
95% RUNTEST 60 TCK;
95% 000000000000000000000000000000000000000000000000000000000000000000);
95% SIR 10 TDI (003);
95% RUNTEST 49152 TCK;
95% RUNTEST 512 TCK;
95% SIR 10 TDI (3FF);
95% RUNTEST 12000 TCK;
95% STATE IDLE;

Time used: 0m1s439ms
svf file programmed successfully for 17 commands with 0 errors

Programming the SPI flash

  • openFPGALoader Intel/Altera
  • Here openFPGALoader -b de0nano -f de0nano.rpd fails with the error flash stackflow; the command below programs the flash successfully instead.
~$ openFPGALoader -c usb-blaster --fpga-part ep4ce2217 -f  de0nano.rbf

Using UrJTAG

  • The version from apt-get install urjtag is too old and does not support the ep4ce22, reporting:
~$ jtag
jtag> cable UsbBlaster vid=0x09fb pid=0x6001 interface=0
Connected to libftdi driver.
jtag> detect
IR length: 10
Chain length: 1
Device Id: 00000010000011110011000011011101 (0x020F30DD)
Manufacturer: Altera (0x0DD)
Unknown part! (0010000011110011) (/usr/share/urjtag/altera/PARTS)

  • Following the guide referenced here, build from the latest (urjtag-2021.03) sources; this also requires downloading the D2XX Drivers from FTDI's website.

  • Download the D2XX Drivers:

~$ wget -c https://ftdichip.com/wp-content/uploads/2021/09/libftd2xx-x86_64-1.4.24.tgz
~$ tar xvf libftd2xx-x86_64-1.4.24.tgz
release/
release/release-notes.txt
release/WinTypes.h
[...]
  • Build and install urjtag-2021.03:
~$ wget -c https://sourceforge.net/projects/urjtag/files/urjtag/2021.03/urjtag-2021.03.tar.xz/download
~$ tar xvf urjtag-2021.03.tar.xz
~$ cd urjtag-2021.03
~$ CFLAGS=-I$PWD/../release LDFLAGS="-L$PWD/../release/build -lftd2xx" ./configure --with-libusb --with-libftdi --with-ftd2xx
[...]
Libraries:
libusb : 1.0
libftdi : yes (have async mode)
libftd2xx : yes
inpout32 : no

Subsystems:
SVF : yes
BSDL : yes
STAPL : no

Drivers:
Bus : ahbjtag arm9tdmi au1500 avr32 bcm1250 blackfin bscoach ejtag ejtag_dma fjmem ixp425 ixp435 ixp465 jopcyc h7202 lh7a400 mpc5200 mpc824x mpc8313 mpc837x ppc405ep ppc440gx_ebc8 prototype pxa2x0 pxa27x s3c4510 sa1110 sh7727 sh7750r sh7751r sharc_21065L sharc_21369_ezkit slsup3 tx4925 zefant_xs3
Cable : arcom byteblaster dirtyjtag dlc5 ea253 ei012 ft2232 gpio ice100 igloo jlink keithkoep lattice mpcbdm triton usbblaster vsllink wiggler xpc
Lowlevel : direct ftdi ftd2xx ppdev

Language bindings:
python : yes
~$ make && make install

  • Apply the patch adding the ep4ce22 description files:
~$ cd /usr/local/share/urjtag$
~$ sudo patch -p1 < ~/urjtag-descriptors.patch
patching file altera/ep4ce22/ep4ce22
patching file altera/ep4ce22/STEPPINGS
patching file altera/PARTS
Hunk #1 succeeded at 28 (offset 2 lines).
michael@debian:/usr/local/share/urjtag$ /usr/local/bin/jtag
  • The patch file:
~$ cat urjtag-descriptors.patch
diff -Naur urjtag-orig/altera/ep4ce22/ep4ce22 urjtag/altera/ep4ce22/ep4ce22
--- urjtag-orig/altera/ep4ce22/ep4ce22 1970-01-01 10:00:00.000000000 +1000
+++ urjtag/altera/ep4ce22/ep4ce22 2014-07-30 21:48:09.652857260 +1000
@@ -0,0 +1,12 @@
+instruction length 10
+register DIR 32
+register USERCODE 32
+register BSR 732
+register BYPASS 1
+instruction HIGHZ 0000001011 BYPASS
+instruction CLAMP 0000001010 BYPASS
+instruction USERCODE 0000000111 USERCODE
+instruction IDCODE 0000000110 DIR
+instruction SAMPLE/PRELOAD 0000000101 BSR
+instruction EXTEST 0000001111 BSR
+instruction BYPASS 1111111111 BYPASS
diff -Naur urjtag-orig/altera/ep4ce22/STEPPINGS urjtag/altera/ep4ce22/STEPPINGS
--- urjtag-orig/altera/ep4ce22/STEPPINGS 1970-01-01 10:00:00.000000000 +1000
+++ urjtag/altera/ep4ce22/STEPPINGS 2014-07-30 21:48:09.644857260 +1000
@@ -0,0 +1,23 @@
+#
+# $Id: STEPPINGS 897 2007-12-29 13:02:32Z arniml $
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+# 02111-1307, USA.
+#
+# Written by H Hartley Sweeten <hsweeten@visionengravers.com>
+#
+
+# bits 31-28 of the Device Identification Register
+0000 ep4ce22 0
diff -Naur urjtag-orig/altera/PARTS urjtag/altera/PARTS
--- urjtag-orig/altera/PARTS 2014-07-28 22:19:56.968449502 +1000
+++ urjtag/altera/PARTS 2014-07-30 21:48:08.464857263 +1000
@@ -26,3 +26,4 @@
0111000100101000 epm7128aetc100 EPM7128AETC100
0111000001100100 epm3064a EPM3064A
0010000010110010 ep2c8 EP2C8
+0010000011110011 ep4ce22 EP4CE22

  • Run the new UrJTAG:
~$ /usr/local/bin/jtag

UrJTAG 2021.03 #
Copyright (C) 2002, 2003 ETC s.r.o.
Copyright (C) 2007, 2008, 2009 Kolja Waschk and the respective authors

UrJTAG is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
There is absolutely no warranty for UrJTAG.

warning: UrJTAG may damage your hardware!
Type "quit" to exit, "help" for help.

jtag> cable UsbBlaster vid=0x09fb pid=0x6001 interface=0
Connected to libftdi driver.
jtag> detect
IR length: 10
Chain length: 1
Device Id: 00000010000011110011000011011101 (0x020F30DD)
Manufacturer: Altera (0x0DD)
Part(0): EP4CE22 (0x20F3)
Stepping: 0
Filename: /usr/local/share/urjtag/altera/ep4ce22/ep4ce22
jtag> cable usbblaster driver=ftdi
Connected to libftdi driver.
jtag> detect
IR length: 10
Chain length: 1
Device Id: 00000010000011110011000011011101 (0x020F30DD)
Manufacturer: Altera (0x0DD)
Part(0): EP4CE22 (0x20F3)
Stepping: 0
Filename: /usr/local/share/urjtag/altera/ep4ce22/ep4ce22
jtag> print chain
No. Manufacturer Part Stepping Instruction Register
-------------------------------------------------------------------------------------------------------------------
* 0 Altera EP4CE22 0 BYPASS BYPASS
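
The 32-bit Device Id printed by both openocd and UrJTAG decodes per the JTAG IDCODE layout: bits 31-28 are the version/stepping, bits 27-12 the part number, bits 11-1 the JEDEC manufacturer id, and bit 0 is always 1. A quick Python check against the 0x020F30DD value seen above:

```python
def decode_idcode(idcode):
    """Split a 32-bit JTAG IDCODE into (version, part, manufacturer)."""
    version = (idcode >> 28) & 0xF        # stepping / revision
    part = (idcode >> 12) & 0xFFFF        # part number field
    manufacturer = (idcode >> 1) & 0x7FF  # JEDEC manufacturer id
    return version, part, manufacturer

ver, part, mfg = decode_idcode(0x020F30DD)
print(ver, hex(part), hex(mfg))  # matches the openocd report: ver 0x0, part 0x20f3, mfg 0x06e
```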

Other topics

USB Blaster connection issues

  • Altera Design Software

  • Since only the quartus_pgm command is needed here, only QuartusProgrammerSetup-16.1.0.196-linux.run was downloaded.

~$ jtagd --foreground --debug

~$ ./jtagd --user-start --foreground
~$ ./jtagconfig
Error (Server error) when scanning hardware

  • Check the system log:
~$ dmesg
[...]
[25811.819181] usb 4-2: USB disconnect, device number 16
[25814.375520] usb 4-2: new full-speed USB device number 17 using xhci_hcd
[25814.550270] usb 4-2: New USB device found, idVendor=09fb, idProduct=6001, bcdDevice= 4.00
[25814.550283] usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[25814.550289] usb 4-2: Product: USB-Blaster
[25814.550293] usb 4-2: Manufacturer: Altera
[25814.550297] usb 4-2: SerialNumber: 91d28408

  • Running jtagconfig directly fails with the error below.

    ~$ ./jtagconfig
    Error when scanning hardware - Server error
  • Then run it under strace, filtering for network calls:

~$ strace -e trace=network jtagconfig
[...]
si_stime=0} ---
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(3, SOL_SOCKET, SO_LINGER, {l_onoff=1, l_linger=10}, 8) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(1309), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(4, SOL_SOCKET, SO_LINGER, {l_onoff=1, l_linger=10}, 8) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(1309), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
recvfrom(3, "", 2, 0, NULL, NULL) = 0
getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
recvfrom(4, "", 2, 0, NULL, NULL) = 0
recvfrom(-1, 0x1e4ca0c, 2, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)

  • To rule out a hardware problem, with no second machine available, the same OS version was installed in VirtualBox; inside the VM the device is detected without any extra setup:
    ~$ ./jtagconfig
    1) USB-Blaster [2-2]
    Unable to read device chain - JTAG chain broken

  • Next, following the official documentation, add the udev rules and start jtagd in debug mode:
~# cat>/etc/udev/rules.d/51-altera-usb-blaster.rules<<EOF
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6001", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6002", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6003", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6010", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6810", MODE="0666"
EOF

~$ jtagd --foreground --debug
JTAG daemon started
Using config file /etc/jtagd/jtagd.conf
Remote JTAG permitted when password set
USB-Blaster "USB-Blaster" firmware version 4.00
USB-Blaster endpoints out=02(64), in=81(64); urb size=1024
USB-Blaster added "USB-Blaster [4-2]"
USB-Blaster port (/dev/bus/usb/004/017) opened

  • Running jtagconfig directly still reports Error when scanning hardware - Server error; applying the settings below, per the documentation above, fixes it.

jtagd server configuration

~# cp /fullpath/intelFPGA_lite/16.1/qprogrammer/linux64/pgm_parts.txt /etc/jtagd/jtagd.pgm_parts
~# echo "Password = \"123456\";" > /etc/jtagd/jtagd.conf
~# killall -9 jtagd
~$ jtagd --foreground --debug
JTAG daemon started
Using config file /etc/jtagd/jtagd.conf
Remote JTAG permitted when password set
USB-Blaster "USB-Blaster" firmware version 4.00
USB-Blaster endpoints out=02(64), in=81(64); urb size=1024
USB-Blaster added "USB-Blaster [6-2]"
USB-Blaster port (/dev/bus/usb/006/002) opened
USB-Blaster "USB-Blaster" firmware version 4.00
USB-Blaster endpoints out=02(64), in=81(64); urb size=1024
USB-Blaster reports JTAG protocol version 0, using version 0

jtagconfig configuration

~$ jtagconfig --addserver 127.0.0.1 123456
~$ jtagconfig
1) USB-Blaster on 127.0.0.1 [6-2]
020F30DD EP3C25/EP4CE22


Machine Learning Frameworks

CUDA

NVIDIA CUDA Toolkit Release Notes
Linux installation guide

Local system

~$ nvidia-debugdump -l
Found 1 NVIDIA devices
Device ID: 0
Device name: Quadro P600 (*PrimaryCard)
GPU internal ID: 0422018092726

~$ cat /etc/*release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18+deb9u1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)

Installing CUDA

  • Choose a suitable version to install according to the official guide.

Table 1. CUDA Toolkit and Compatible Driver Versions

| CUDA Toolkit | Linux x86_64 Driver Version | Windows x86_64 Driver Version |
| CUDA 10.0.130 | >= 410.48 | >= 411.31 |
| CUDA 9.2 (9.2.148 Update 1) | >= 396.37 | >= 398.26 |
| CUDA 9.2 (9.2.88) | >= 396.26 | >= 397.44 |
| CUDA 9.1 (9.1.85) | >= 390.46 | >= 391.29 |
| CUDA 9.0 (9.0.76) | >= 384.81 | >= 385.54 |
| CUDA 8.0 (8.0.61 GA2) | >= 375.26 | >= 376.51 |
| CUDA 8.0 (8.0.44) | >= 367.48 | >= 369.30 |
| CUDA 7.5 (7.5.16) | >= 352.31 | >= 353.66 |
| CUDA 7.0 (7.0.28) | >= 346.46 | >= 347.62 |
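The driver floor in Table 1 can be encoded as a small lookup helper — a sketch using the table's data; the dict and function names are mine, not an NVIDIA API:

```python
# Minimum Linux x86_64 driver version per CUDA toolkit release
# (values taken from Table 1 above; hypothetical helper for illustration).
CUDA_MIN_LINUX_DRIVER = {
    "10.0.130": (410, 48),
    "9.2.148": (396, 37),
    "9.0.76": (384, 81),
    "8.0.61": (375, 26),
}

def driver_ok(cuda_version: str, driver: str) -> bool:
    """Return True if the installed driver meets the toolkit's minimum."""
    required = CUDA_MIN_LINUX_DRIVER[cuda_version]
    installed = tuple(int(p) for p in driver.split("."))[:2]
    return installed >= required

print(driver_ok("10.0.130", "410.48"))  # the driver shown by nvidia-smi below
```

Tuple comparison handles the major/minor ordering, so "410.48" satisfies the CUDA 10.0 requirement while "396.54" does not.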
  • First purge any previously installed NVIDIA packages:
~$ dpkg -l | grep "nvidia" | awk '{print $2}' | xargs sudo dpkg --purge
  • Verify the installation:
~$ nvidia-smi
Fri Nov 23 11:00:29 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro P600 Off | 00000000:01:00.0 On | N/A |
| 34% 31C P8 N/A / N/A | 401MiB / 1999MiB | 4% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 898 G /usr/lib/xorg/Xorg 156MiB |
| 0 13787 G ...uest-channel-token=15869920746181936845 96MiB |
| 0 16550 G ...-token=D890EF91A7BB8E03F6D8D7795CC12E48 145MiB |
+-----------------------------------------------------------------------------+

  • Install the matching version of cuDNN:
~$ tar xvf cudnn-10.0-linux-x64-v7.4.1.5.tgz -C /usr/local

Installing the Nvidia Tesla driver on Debian Buster

~$ apt-get install nvidia-tesla-460-kernel-dkms nvidia-tesla-460-driver libnvidia-tesla-460-cuda1 nvidia-xconfig nvidia-tesla-460-smi
~$ nvidia-smi
Sat Mar 27 21:01:55 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GT 1030 On | 00000000:05:00.0 On | N/A |
| N/A 38C P5 N/A / 30W | 449MiB / 2000MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 3365 G /usr/lib/xorg/Xorg 326MiB |
| 0 N/A N/A 10847 G ...AAAAAAAAA= --shared-files 22MiB |
| 0 N/A N/A 15542 G ...chael/firefox/firefox-bin 0MiB |
| 0 N/A N/A 16429 G ...AAAAAAAA== --shared-files 97MiB |
+-----------------------------------------------------------------------------+

~$ sudo nvidia-xconfig # generates /etc/X11/xorg.conf; editing it by hand can leave the desktop unable to start. vdpauinfo can also be used to check the driver.

DKMS driver build error

/var/lib/dkms/nvidia-tesla-460/460.91.03/build/common/inc/nv-misc.h:20:12: fatal error: stddef.h: No such file or directory
   20 | #include <stddef.h> // NULL
      |          ^~~~~~~~~~

  • The error above means the headers under /usr/src/<linux-5.17-SRC>/include/linux are not on the include path. Below is a combined patch set for the driver; the patch file needs to be registered in /usr/src/nvidia-tesla-460-460.91.03/dkms.conf.
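For reference, the dkms.conf entry might look like the fragment below — PATCH[n] is standard DKMS syntax, and the path assumes the patch lives in a patches/ subdirectory of the source tree as shown below:

```shell
# Fragment of /usr/src/nvidia-tesla-460-460.91.03/dkms.conf (assumed layout):
# DKMS applies each PATCH[n] entry to the source tree before building.
PATCH[0]="patches/nvidia-tesla-460-linux-5.17-combind.patch"
```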
~$ cat nvidia-tesla-460-460.91.03/patches/nvidia-tesla-460-linux-5.17-combind.patch
diff -u a/Kbuild b/Kbuild
--- a/Kbuild 2021-07-02 14:04:57.000000000 +0800
+++ b/Kbuild 2022-05-15 12:38:09.968486119 +0800
@@ -68,7 +68,7 @@

EXTRA_CFLAGS += -I$(src)/common/inc
EXTRA_CFLAGS += -I$(src)
-EXTRA_CFLAGS += -Wall -MD $(DEFINES) $(INCLUDES) -Wno-cast-qual -Wno-error -Wno-format-extra-args
+EXTRA_CFLAGS += -Wall -MD $(DEFINES) $(INCLUDES) -Wno-cast-qual -Wno-error -Wno-format-extra-args -I./include/linux
EXTRA_CFLAGS += -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"460.91.03\" -Wno-unused-function -Wuninitialized -fno-strict-aliasing -mno-red-zone -mcmodel=kernel -DNV_UVM_ENABLE
EXTRA_CFLAGS += $(call cc-option,-Werror=undef,)
EXTRA_CFLAGS += -DNV_SPECTRE_V2=$(NV_SPECTRE_V2)

diff -ruN a/nvidia-uvm/uvm_linux.h b/nvidia-uvm/uvm_linux.h
--- a/nvidia-uvm/uvm_linux.h 2021-07-02 14:07:31.000000000 +0800
+++ b/nvidia-uvm/uvm_linux.h 2021-09-04 00:24:32.426673346 +0800
@@ -485,7 +485,7 @@
#elif (NV_WAIT_ON_BIT_LOCK_ARGUMENT_COUNT == 4)
static __sched int uvm_bit_wait(void *word)
{
- if (signal_pending_state(current->state, current))
+ if (signal_pending_state(current->__state, current))
return 1;
schedule();
return 0;

diff -u nvidia-tesla-460-460.91.03{,.old}/nvidia-drm/nvidia-drm-format.c
--- nvidia-tesla-460-460.91.03/nvidia-drm/nvidia-drm-format.c 2021-07-02 14:07:31.000000000 +0800
+++ nvidia-tesla-460-460.91.03.old/nvidia-drm/nvidia-drm-format.c 2022-05-15 15:17:23.498152286 +0800
@@ -29,6 +29,7 @@
#endif
#include <linux/kernel.h>

+#include "nvidia-uvm/uvm_linux.h"
#include "nvidia-drm-format.h"
#include "nvidia-drm-os-interface.h"

diff -u nvidia-tesla-460-460.91.03/common/inc/nv-procfs.h nvidia-tesla-460-460.91.03.old/common/inc/nv-procfs.h
--- nvidia-tesla-460-460.91.03/common/inc/nv-procfs.h 2021-07-02 14:07:32.000000000 +0800
+++ nvidia-tesla-460-460.91.03.old/common/inc/nv-procfs.h 2022-05-15 15:52:20.475063183 +0800
@@ -11,6 +11,11 @@
#define _NV_PROCFS_H

#include "conftest.h"
+#include <linux/version.h>
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(5,17,0))
+#define NV_PDE_DATA_PRESENT
+#define PDE_DATA(inode) pde_data(inode)
+#endif

#ifdef CONFIG_PROC_FS
#include <linux/proc_fs.h>
diff --git a/common/inc/nv-time.h b/common/inc/nv-time.h
index dc80806..cc343a5 100644
--- a/common/inc/nv-time.h
+++ b/common/inc/nv-time.h
@@ -23,6 +23,7 @@
#ifndef __NV_TIME_H__
#define __NV_TIME_H__

+#include <linux/version.h>
#include "conftest.h"
#include <linux/sched.h>
#include <linux/delay.h>
@@ -205,7 +206,12 @@ static inline NV_STATUS nv_sleep_ms(unsigned int ms)
// the requested timeout has expired, loop until less
// than a jiffie of the desired delay remains.
//
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(5, 14, 0))
current->state = TASK_INTERRUPTIBLE;
+#else
+ // Rel. commit "sched: Change task_struct::state" (Peter Zijlstra, Jun 11 2021)
+ WRITE_ONCE(current->__state, TASK_INTERRUPTIBLE);
+#endif
do
{
schedule_timeout(jiffies);

diff --git a/nvidia-drm/nvidia-drm-drv.c b/nvidia-drm/nvidia-drm-drv.c
index 84d4479..99ea552 100644
--- a/nvidia-drm/nvidia-drm-drv.c
+++ b/nvidia-drm/nvidia-drm-drv.c
@@ -20,6 +20,7 @@
* DEALINGS IN THE SOFTWARE.
*/

+#include <linux/version.h>
#include "nvidia-drm-conftest.h" /* NV_DRM_AVAILABLE and NV_DRM_DRM_GEM_H_PRESENT */

#include "nvidia-drm-priv.h"
@@ -903,9 +904,12 @@ static void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)

dev->dev_private = nv_dev;
nv_dev->dev = dev;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(5, 14, 0))
+ // Rel. commit "drm: Remove pdev field from struct drm_device" (Thomas Zimmermann, 3 May 2021)
if (device->bus == &pci_bus_type) {
dev->pdev = to_pci_dev(device);
}
+#endif

/* Register DRM device to DRM sub-system */

Installing PyCUDA

Installing Nvidia Docker

nvidia-docker

  • Install following the documentation linked above:
    # If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
    docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
    sudo apt-get purge -y nvidia-docker

    # Add the package repositories
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
    sudo apt-key add -
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update

    # Install nvidia-docker2 and reload the Docker daemon configuration
    sudo apt-get install -y nvidia-docker2
    sudo pkill -SIGHUP dockerd # important if Docker was already installed: signal the running daemon to reload its configuration.

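After installing nvidia-docker2, the package registers the NVIDIA runtime in /etc/docker/daemon.json; the file should end up looking roughly like this (shown for reference — verify against what your installed version actually wrote):

```json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```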
Running TensorFlow GPU in Docker

~$ docker pull tensorflow/tensorflow:latest-gpu
~$ nvidia-docker run -it --rm tensorflow/tensorflow:latest-gpu python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
2018-11-23 03:45:33.118841: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-11-23 03:45:33.196995: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node,so returning NUMA node zero
2018-11-23 03:45:33.197568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Quadro P600 major: 6 minor: 1 memoryClockRate(GHz): 1.5565
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.57GiB
[...]

Nvidia GPU Cloud (NGC) containers

Install and run the Caffe2 container:

~$ docker run --runtime=nvidia -it caffe2ai/caffe2:latest python -m caffe2.python.operator_test.relu_op_test
Trying example: test_relu(self=<__main__.TestRelu testMethod=test_relu>, X=array([[[-0.42894608],
[-0.65820682],
[ 0.39978197],
[...]

  • Installing from Nvidia GPU Cloud (NGC):

    ~$ docker pull nvcr.io/nvidia/caffe2:18.08-py3
    # run the test.
    ~$ nvidia-docker run --runtime=nvidia -it nvcr.io/nvidia/caffe2:18.08-py3 python -m caffe2.python.operator_test.relu_op_test
  • The example below runs a jupyter notebook inside Docker, mapping the container's port 8888 to port 9999 on the host; it can then be reached from a browser on the host at http://127.0.0.1:9999/.

  • --rm removes the container after it exits

  • -it runs the container interactively

  • -v maps a host directory into the container: below, the host's /data/AI-DIR/TensorFlow/jupyter-notebook is mounted at /data/jupyter inside the container.

~$ nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -v /data/AI-DIR/TensorFlow/jupyter-notebook:/data/jupyter -it -p 9999:8888 nvcr.io/nvidia/caffe2:18.08-py3 sh -c "jupyter notebook --no-browser --allow-root --ip 0.0.0.0 /data/jupyter"

============
== Caffe2 ==
============

NVIDIA Release 18.08 (build 599137)

Container image Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
[...]

Installing PyTorch

~$ docker pull nvcr.io/nvidia/pytorch:18.11-py3

Building PyTorch from source

~$ pyenv activate py3dev  # enter the Python 3.6 virtualenv managed by pyenv.
~$ pip install numpy pyyaml mkl mkl-include setuptools cmake cffi typing # install build dependencies into py3dev.
~$ export PATH=/usr/local/cuda-10.0/bin:$PATH
~$ export CUDA=1
~$ pip install pycuda
~$ git clone --recursive https://github.com/pytorch/pytorch
~$ cd pytorch
~$ python setup.py install
  • If numpy fails with libmkl_rt.so: cannot open shared object file: No such file or directory, install libmkl_rt and then reinstall numpy.
  • caffe2 has now been merged into the PyTorch source tree. If importing the modules below raises ModuleNotFoundError: No module named 'past', first install the dependency with pip install future.
import matplotlib.pyplot as plt
import numpy as np
import os
import shutil
import caffe2.python.predictor.predictor_exporter as pe
from caffe2.python import core,model_helper,net_drawer,workspace,visualize,brew
ModuleNotFoundError: No module named 'past'
  • The warning net_drawer will not run correctly. Please install the correct dependencies. means pydot is missing; install it with pip install pydot.

Building TensorFlow from source (with CUDA 10 support)

Installing Bazel

~$ echo 'deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8' | sudo tee /etc/apt/sources.list.d/bazel.list
~$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
~$ sudo apt-get update
~$ sudo apt-get install bazel
  • Because of network restrictions (the GFW), the install above may be very slow; an installer script can be downloaded directly from https://github.com/bazelbuild/bazel/releases instead. Note that apt-get install bazel currently installs the latest 0.20.0, while TensorFlow 1.12.0 only builds with bazel 0.19.2.
$ bazel version
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Build label: 0.19.2
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Mon Nov 19 16:25:09 2018 (1542644709)
Build timestamp: 1542644709
Build timestamp as int: 1542644709

Fetching and configuring the TensorFlow source

~$ export PATH=/usr/local/cuda/bin:$PATH
~$ export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:/usr/local/cuda-10.0/extras/CUPTI/lib64:$LD_LIBRARY_PATH
~$ git clone https://github.com/tensorflow/tensorflow.git
~$ pyenv activate py3dev
~$ pip install wheel
~$ ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.19.2 installed.
Please specify the location of python. [Default is fullpath/.pyenv/versions/py3dev/bin/python]:

Found possible Python library paths:
/fullpath/.pyenv/versions/py3dev/lib/python3.6/site-packages
Please input the desired Python library path to use. Default is [/fullpath/.pyenv/versions/py3dev/lib/python3.6/site-packages]
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: Y
XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: N
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 10.0
Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]:
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Do you wish to build TensorFlow with TensorRT support? [y/N]: N
No TensorRT support will be enabled for TensorFlow.

Please specify the locally installed NCCL version you want to use. [Default is to use https://github.com/nvidia/nccl]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,7.0]:

Do you want to use clang as CUDA compiler? [y/N]: N
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]: N
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: y
Please specify the home path of the Android NDK to use. [Default is /fullpath/Android/Sdk/ndk-bundle]:
Please specify the home path of the Android SDK to use. [Default is /fullpath/Android/Sdk]:
Please specify the Android SDK API level to use. [Available levels: ['13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28']] [Default is 28]:

Please specify an Android build tools version to use. [Available versions: ['21.1.2', '23.0.3', '24.0.3', '25.0.0', '25.0.2', '25.0.3', '26.0.2', '27.0.0', '27.0.3', '28.0.0-rc2', '28.0.2', '28.0.3']] [Default is 28.0.3]:

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
--config=gdr # Build with GDR support.
--config=verbs # Build with libverbs support.
--config=ngraph # Build with Intel nGraph support.
--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
--config=noaws # Disable AWS S3 filesystem support.
--config=nogcp # Disable GCP support.
--config=nohdfs # Disable HDFS support.
--config=noignite # Disable Apacha Ignite support.
--config=nokafka # Disable Apache Kafka support.
--config=nonccl # Disable NVIDIA NCCL support.
Configuration finished

~$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

~$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg # build the pip package.
Wed Dec 5 10:54:27 CST 2018 : === Preparing sources in dir: /tmp/tmp.OZdwkuc2YO
~/github/tensorflow ~/github/tensorflow
[...]

~$ pip install /tmp/tensorflow_pkg/tensorflow-1.12.0rc0-cp36-cp36m-linux_x86_64.whl # install the built wheel.
~$ LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-10.0/extras/CUPTI/lib64:$LD_LIBRARY_PATH jupyter notebook # test TensorFlow from a notebook.
  • If the error failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error appears as below, first run apt-get install nvidia-modprobe in a terminal, then reboot the system.
In [1]: import tensorflow as tf
In [2]: tf.test.is_built_with_cuda()
Out[2]: True
In [3]: tf.test.is_gpu_available(cuda_only=False,min_cuda_compute_capability=None)
2018-12-05 12:03:06.128401: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
2018-12-05 12:03:06.128442: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:161] retrieving CUDA diagnostic information for host: debian
2018-12-05 12:03:06.128448: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:168] hostname: debian
2018-12-05 12:03:06.128470: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:192] libcuda reported version is: 410.48.0
2018-12-05 12:03:06.128488: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:196] kernel reported version is: 410.48.0
2018-12-05 12:03:06.128493: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:303] kernel version seems to match DSO: 410.48.0
Out[3]: False

In [4]: tf.test.is_gpu_available(cuda_only=False,min_cuda_compute_capability=None)
Out[4]: False

# After rebooting, the GPU works normally.

In [1]: import tensorflow as tf

In [2]: tf.Session().list_devices()
2018-12-05 15:02:22.981018: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-05 15:02:22.982813: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55d255632230 executing computations on platform CUDA. Devices:
2018-12-05 15:02:22.982835: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): Quadro P600, Compute Capability 6.1
2018-12-05 15:02:22.983889: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1431] Found device 0 with properties:
name: Quadro P600 major: 6 minor: 1 memoryClockRate(GHz): 1.5565
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.74GiB
2018-12-05 15:02:22.983931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Adding visible gpu devices: 0
2018-12-05 15:02:22.986678: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-05 15:02:22.986711: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-05 15:02:22.986726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-05 15:02:22.986953: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1113] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1560 MB memory) -> physical GPU (device: 0, name: Quadro P600, pci bus id: 0000:01:00.0, compute capability: 6.1)
Out[2]:
[_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456, 4411150611837152607),
_DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 8331037032149977949),
_DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_GPU:0, XLA_GPU, 17179869184, 1279689307458374322),
_DeviceAttributes(/job:localhost/replica:0/task:0/device:GPU:0, GPU, 1636106240, 7170667474598106347)]

In [3]: tf.test.is_gpu_available(cuda_only=False,min_cuda_compute_capability=None)
2018-12-05 15:05:52.037618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Adding visible gpu devices: 0
2018-12-05 15:05:52.037647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-05 15:05:52.037652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-05 15:05:52.037656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-05 15:05:52.037737: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1113] Created TensorFlow device (/device:GPU:0 with 1560 MB memory) -> physical GPU (device: 0, name: Quadro P600, pci bus id: 0000:01:00.0, compute capability: 6.1)
Out[3]: True

Running the TensorBoard visualization frontend

import tensorflow as tf
input1 = tf.constant([1.0,2.0,3.0],name='input1')
input2 = tf.constant([2.0,3.0,4.0],name='input2')
output = tf.add_n([input1,input2],name='add')
with tf.Session() as sess:
    writer = tf.summary.FileWriter(graph=sess.graph,logdir='./graph')
    sess.run(output)
  • After running the snippet above, start tensorboard --logdir='./graph' --port=6006 in a terminal; once its web server is up, the visualization frontend can be opened in a browser.

TensorFlow usage notes

Reading and writing TFRecords

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
# Helpers that wrap raw values in tf.train.Feature.
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

# mnist/data holds the MNIST data downloaded from the net.
mnist = input_data.read_data_sets('./mnist/data',dtype=tf.uint8,one_hot=True)
images = mnist.train.images
labels = mnist.train.labels

pixels = images.shape[1]
num_examples = mnist.train.num_examples

filename = './mnist/output.tfrecords'

# Convert one record into tf.train.Example format.
def _make_example(pixels, label, image):
    image_raw = image.tostring()
    example = tf.train.Example(features=tf.train.Features(feature={
        'pixels': _int64_feature(pixels),
        'label': _int64_feature(np.argmax(label)),
        'image_raw': _bytes_feature(image_raw)
    }))
    return example

with tf.python_io.TFRecordWriter(filename) as writer:
    for index in range(num_examples):
        example = _make_example(pixels,labels[index],images[index])
        writer.write(example.SerializeToString())
print('TFRecord test file written')
  • Read the TFRecord file back using the same feature schema.
    reader = tf.TFRecordReader()
    filename_queue = tf.train.string_input_producer(['./mnist/output.tfrecords'])
    _,serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'pixels': tf.FixedLenFeature([],tf.int64),
            'label': tf.FixedLenFeature([],tf.int64),
            'image_raw': tf.FixedLenFeature([],tf.string),
        })

    images = tf.decode_raw(features['image_raw'],tf.uint8)
    labels = tf.cast(features['label'],tf.int32)
    pixels = tf.cast(features['pixels'],tf.int32)


    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        for i in range(10):
            image,label,pixel = sess.run([images,labels,pixels])
        coord.request_stop()
        coord.join(threads)
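The key point — records must be parsed with the same schema they were written with — can be seen in a minimal stdlib analogue, where struct stands in for the Example protobuf (a sketch of the idea, not the TFRecord wire format):

```python
import struct

# Write: pack a fixed schema (one int64 label, then a 4-byte raw image blob).
record = struct.pack("<q4s", 7, b"\x01\x02\x03\x04")

# Read: unpacking with the same schema string recovers the fields;
# a different schema would misinterpret the bytes.
label, image_raw = struct.unpack("<q4s", record)
print(label, image_raw)  # → 7 b'\x01\x02\x03\x04'
```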

Reading raw images

import matplotlib.pyplot as plt

image_raw_data = tf.gfile.FastGFile('../img3.png','rb').read()
with tf.Session() as sess:
    img_data = tf.image.decode_png(image_raw_data)
    # Print the decoded 3-D matrix.
    print(img_data.eval())
    plt.imshow(img_data.eval())
    plt.show()
    img_data.set_shape([420,420,3])
    print(img_data.get_shape())

  • Resizing images — the method argument selects the algorithm:

| Method | Algorithm |
| 0 | Bilinear interpolation |
| 1 | Nearest neighbor interpolation |
| 2 | Bicubic interpolation |
| 3 | Area interpolation |
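Method 1 (nearest neighbor) is the simplest of the four to illustrate; a pure-Python sketch on a list-of-lists image (the helper is mine, not the TensorFlow implementation):

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbor resize: each output pixel copies the closest source pixel."""
    h, w = len(img), len(img[0])
    return [[img[i * h // new_h][j * w // new_w] for j in range(new_w)]
            for i in range(new_h)]

src = [[0, 1],
       [2, 3]]
print(resize_nearest(src, 4, 4))
# → [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
```

Bilinear and bicubic interpolation instead blend neighboring source pixels, which is why they produce smoother (but blurrier) upscales.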

with tf.Session() as sess:
    # Feeding 0-255 integers straight into resize_images yields reals in 0-255,
    # which is awkward downstream; the book recommends converting the image to
    # 0-1 floats before resizing.
    image_float = tf.image.convert_image_dtype(img_data,tf.float32)
    resized = tf.image.resize_images(image_float,[400,400],method=0)
    plt.imshow(resized.eval())
    plt.show()
  • Cropping and padding images

    with tf.Session() as sess:
        croped = tf.image.resize_image_with_crop_or_pad(img_data,300,300)
        padded = tf.image.resize_image_with_crop_or_pad(img_data,520,520)
        plt.imshow(croped.eval())
        plt.show()
        plt.imshow(padded.eval())
        plt.show()
  • Cropping the central 50% region

    with tf.Session() as sess:
        central_cropped = tf.image.central_crop(img_data, 0.5)
        plt.imshow(central_cropped.eval())
        plt.show()

Installing and using Keras

  • Keras provides high-level Python APIs for quickly building and training deep learning models, with TensorFlow or Theano as the backend. It is minimal, and its modular approach makes building and running neural networks lightweight.
    In [1]: import keras
    Using TensorFlow backend.

    In [2]: keras.__version__
    Out[2]: '2.2.4'

    In [3]: !cat /home/lcy/.keras/keras.json
    {
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
    }

Keras MNIST handwritten-digit test

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential,load_model
from keras.layers import Dense,Dropout,Conv2D,Flatten,MaxPooling2D,Activation
from keras.datasets.mnist import load_data
import os

# Clear the GPU session state.
keras.backend.clear_session()
(X_train,Y_train),(x_test,y_test) = load_data()
X_train = X_train.reshape(X_train.shape[0],28,28,1)
x_test = x_test.reshape(x_test.shape[0],28,28,1)
input_shape = (28,28,1)
X_train = X_train.astype('float32')
x_test = x_test.astype('float32')
X_train /= 255
x_test /= 255
print('x_train shape:',X_train.shape)
print('Number of images in x_train',X_train.shape[0])
print('Number of images in x_test',x_test.shape[0])

# Convolutional network.
model = Sequential()
model.add(Conv2D(28,kernel_size=(3,3),input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.02))
model.add(Dense(10))
model.add(Activation('softmax'))

# Compile a multi-class classifier.
model.compile(optimizer='adam',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(x=X_train,y=Y_train,epochs=10)

# Save the trained model to an HDF5 file.
history = model.fit(X_train,Y_train,batch_size=128,epochs=20,verbose=2,validation_data=(x_test,y_test))
save_dir = './results/'
mode_name = 'keras_mnist.h5'
mode_path = os.path.join(save_dir,mode_name)
model.save(mode_path)
print('Saved trained model at %s' % mode_path)

# Plot the training curves.
fig = plt.figure()
plt.subplot(2,1,1)
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='lower right')

plt.subplot(2,1,2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')

plt.tight_layout()
fig

# Validate the model with part of the test images.
mnist_model = load_model('./results/keras_mnist.h5')
loss_and_metrics = mnist_model.evaluate(x_test,y_test,verbose=2)
print('Test Loss',loss_and_metrics[0])
print('Test Accuracy',loss_and_metrics[1])
predicted_classes = mnist_model.predict_classes(x_test)
correct_indices = np.nonzero(predicted_classes == y_test)[0]
incorrect_indices = np.nonzero(predicted_classes != y_test)[0]
print()
print(len(correct_indices),' classified correctly')
print(len(incorrect_indices),' classified incorrectly')

plt.rcParams['figure.figsize'] = (7,14)
figure_evaluation = plt.figure()

# Show 9 correctly predicted images.
for i,correct in enumerate(correct_indices[:9]):
    plt.subplot(6,3,i+1)
    plt.imshow(x_test[correct].reshape(28,28),cmap='gray',interpolation='none')
    plt.title('Predicted: {}, Truth: {}'.format(predicted_classes[correct],y_test[correct]))
    plt.xticks([])
    plt.yticks([])
# Show 9 incorrectly predicted images (offset so they don't overwrite the first nine).
for i,incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(6,3,i+10)
    plt.imshow(x_test[incorrect].reshape(28,28),cmap='gray',interpolation='none')
    plt.title('Predicted: {}, Truth: {}'.format(predicted_classes[incorrect],y_test[incorrect]))
    plt.xticks([])
    plt.yticks([])
figure_evaluation


# Load our own hand-made images for testing.

from PIL import Image
from keras_preprocessing.image import img_to_array
from keras_applications import imagenet_utils
data_dir = '/data/AI-DIR/TensorFlow/mnist/test-data/'
list_dir = os.listdir(data_dir)
print(len(list_dir))
image_height = 28
image_width = 28
channels = 1
img_data = np.ndarray(shape=(len(list_dir),image_height,image_width,channels))
label_data = np.zeros(len(list_dir),dtype='uint8')
i = 0
for file in list_dir:
    # Read each image in the directory and convert it to grayscale.
    png = Image.open(os.path.join(data_dir,file),'r').convert('L')
    gray = png.point(lambda x: 0 if x == 255 else 255)
    image = img_to_array(gray)
    img_data[i] = image
    label_data[i] = int(file[4])
    # print('png file: ',file)
    i += 1

print('test_data len',len(img_data))
print('test_data shape',img_data.shape)
print('test label ',label_data)

loss_and_metrics = mnist_model.evaluate(img_data,label_data,verbose=2)
print('Test Loss',loss_and_metrics[0])
print('Test Accuracy',loss_and_metrics[1])

predicted_classes = mnist_model.predict_classes(img_data)
correct_indices = np.nonzero(predicted_classes == label_data)[0]
incorrect_indices = np.nonzero(predicted_classes != label_data)[0]
print()
print(len(correct_indices),' classified correctly')
print(len(incorrect_indices),' classified incorrectly')
plt.rcParams['figure.figsize'] = (7,14)
figure_evaluation = plt.figure()

# Show 9 incorrectly predicted images.
for i,incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(6,3,i+1)
    plt.imshow(img_data[incorrect].reshape(28,28),cmap='gray',interpolation='none')
    plt.title('Predicted: {}, Truth: {}'.format(predicted_classes[incorrect],label_data[incorrect]))
    plt.xticks([])
    plt.yticks([])

figure_evaluation
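The correct/incorrect bookkeeping above (np.nonzero on an elementwise comparison) reduces to a simple index partition; a stdlib-only sketch of the same idea (the function is mine, for illustration):

```python
def partition_indices(predicted, truth):
    """Split sample indices into (correct, incorrect) by comparing predictions
    with ground-truth labels — a plain-Python analogue of np.nonzero(a == b)."""
    correct = [i for i, (p, t) in enumerate(zip(predicted, truth)) if p == t]
    incorrect = [i for i, (p, t) in enumerate(zip(predicted, truth)) if p != t]
    return correct, incorrect

good, bad = partition_indices([7, 2, 1, 0], [7, 2, 7, 0])
print(len(good), 'classified correctly;', len(bad), 'classified incorrectly')
# → 3 classified correctly; 1 classified incorrectly
```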

Errors

Importing the tkinter module

  • matplotlib may complain that the tkinter module cannot be imported; fixing this can be fiddly. The error looks like:

    In [4]: import matplotlib.pyplot as plt
    ---------------------------------------------------------------------------
    ModuleNotFoundError Traceback (most recent call last)
    <ipython-input-4-a0d2faabd9e9> in <module>()
    ----> 1 import matplotlib.pyplot as plt
    [...]
    ModuleNotFoundError: No module named '_tkinter'
  • The fix is as follows:
    ~$ apt-get install tk-dev
    ~$ pyenv uninstall 3.6.6
    ~$ pyenv install 3.6.6
    ~$ pyenv virtualenv 3.6.6 py3dev
    ~$ pyenv activate py3dev
    ~$ python -m tkinter # test the module.
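If rebuilding Python is not an option, a workaround on the matplotlib side is to fall back to the file-only Agg backend when _tkinter is missing, since Agg needs no GUI toolkit. A minimal sketch (pick_backend is a hypothetical helper, not part of matplotlib):

```python
import importlib.util

def pick_backend():
    """Return 'TkAgg' when the tkinter module is importable, otherwise
    the non-interactive 'Agg' backend, which does not need _tkinter."""
    if importlib.util.find_spec('tkinter') is not None:
        return 'TkAgg'
    return 'Agg'

# matplotlib.use(pick_backend()) would then be called before importing pyplot.
```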

Importing the ggplot module

  • Importing the package with from ggplot import * fails as follows:
~/.pyenv/versions/3.6.6/envs/py3dev/lib/python3.6/site-packages/ggplot/stats/smoothers.py in <module>
2 unicode_literals)
3 import numpy as np
----> 4 from pandas.lib import Timestamp
5 import pandas as pd
6 import statsmodels.api as sm

ImportError: cannot import name 'Timestamp'

  • Fix: edit the file .../site-packages/ggplot/stats/smoothers.py and change the original from pandas.lib import Timestamp to from pandas import Timestamp, then save.
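Editing site-packages by hand works but is lost on the next reinstall. The same library reorganization can also be survived with a fallback-import pattern; a sketch (import_compat is a hypothetical helper):

```python
import importlib

def import_compat(name, module_paths):
    """Try importing `name` from each module path in order and return the
    first hit -- the pattern behind 'from pandas.lib import Timestamp'
    versus 'from pandas import Timestamp'."""
    for path in module_paths:
        try:
            return getattr(importlib.import_module(path), name)
        except (ImportError, AttributeError):
            continue
    raise ImportError('%s not found in any of %r' % (name, module_paths))

# e.g. Timestamp = import_compat('Timestamp', ['pandas.lib', 'pandas'])
```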

Installing the Kaggle API

  • Kaggle API
  • Register an account at www.kaggle.com and open the account settings page. Under the API section there are two buttons, Create New API Token and Expire API Token. Clicking Create New API Token downloads a file named kaggle.json, and a toast appears: Ensure kaggle.json is in the location ~/.kaggle/kaggle.json to use the API.
~$ pip install kaggle
~$ mkdir ~/.kaggle
~$ mv ~/Downloads/kaggle.json ~/.kaggle/
~$ chmod 600 ~/.kaggle/kaggle.json
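The chmod 600 step matters because the kaggle CLI warns about credentials readable by other users. A quick self-check of the setup above (kaggle_json_ok is a hypothetical helper) might look like:

```python
import os
import stat

def kaggle_json_ok(path='~/.kaggle/kaggle.json'):
    """True when the credential file exists and only the owner can read
    or write it (mode 600), matching the chmod step above."""
    p = os.path.expanduser(path)
    if not os.path.isfile(p):
        return False
    return stat.S_IMODE(os.stat(p).st_mode) == 0o600
```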

Downloading data

  • Go to https://www.kaggle.com/competitions, open a specific competition, and accept the rules dialog at the bottom of the page (I Understand and Accept); otherwise that competition's data cannot be downloaded.
# Download the data into a specified directory.
~$ kaggle competitions download -c traveling-santa-2018-prime-paths -p /fullpath/Traveling-Santa-2018-Prime-Paths/
~$ kaggle competitions list
ref deadline category reward teamCount userHasEntered
--------------------------------------------- ------------------- --------------- --------- --------- --------------
digit-recognizer 2030-01-01 00:00:00 Getting Started Knowledge 2708 True
titanic 2030-01-01 00:00:00 Getting Started Knowledge 10578 False
house-prices-advanced-regression-techniques 2030-01-01 00:00:00 Getting Started Knowledge 4519 False
imagenet-object-localization-challenge 2029-12-31 07:00:00 Research Knowledge 30 False
competitive-data-science-predict-future-sales 2019-12-31 23:59:00 Playground Kudos 1869 False
histopathologic-cancer-detection 2019-03-31 23:59:00 Playground Knowledge 140 False
humpback-whale-identification 2019-02-28 23:59:00 Featured $25,000 144 False
elo-merchant-category-recommendation 2019-02-26 23:59:00 Featured $50,000 630 False
ga-customer-revenue-prediction 2019-02-15 23:59:00 Featured $45,000 1104 False
quora-insincere-questions-classification 2019-02-05 23:59:00 Featured $25,000 1666 False
pubg-finish-placement-prediction 2019-01-30 23:59:00 Playground Swag 857 False
human-protein-atlas-image-classification 2019-01-10 23:59:00 Featured $37,000 1388 False
traveling-santa-2018-prime-paths 2019-01-10 23:59:00 Featured $25,000 958 True
[...]

FFmpeg with CUDA hardware encoding/decoding

  • Install the necessary packages.
~$ sudo apt-get install nvidia-cuda-toolkit nvidia-cuda-toolkit-gcc  yasm cmake libtool \
libc6 libc6-dev unzip wget libnuma1 libnuma-dev libnvidia-encode1
~$ git clone https://git.videolan.org/git/ffmpeg/nv-codec-headers.git # or the mirror https://github.com/FFmpeg/nv-codec-headers
~$ cd nv-codec-headers && sudo make install
  • Clone FFmpeg’s public GIT repository.
~$ git clone https://git.ffmpeg.org/ffmpeg.git ffmpeg/

~$ cd ffmpeg
~$ ./configure --enable-nonfree --enable-cuda-nvcc --enable-libnpp \
--enable-libmp3lame --enable-v4l2-m2m --enable-vdpau --enable-vaapi \
--enable-libdrm --enable-libx264 --enable-libvpx --enable-libwebp \
--enable-libv4l2 --enable-libopus --enable-libopencore-amrnb --enable-libopencore-amrwb \
--enable-librtmp --enable-gpl --enable-version3 --enable-libvorbis \
--disable-doc --disable-htmlpages --disable-manpages --disable-podpages \
--disable-txtpages --enable-shared

# Alternative: configure with AMF support instead of cuda-nvcc.
~$ ./configure --enable-nonfree --enable-amf --enable-libnpp \
--enable-libmp3lame --enable-v4l2-m2m --enable-vdpau --enable-vaapi \
--enable-libdrm --enable-libx264 --enable-libvpx --enable-libwebp \
--enable-libv4l2 --enable-libopus --enable-libopencore-amrnb --enable-libopencore-amrwb \
--enable-librtmp --enable-gpl --enable-version3 --enable-libvorbis \
--disable-doc --disable-htmlpages --disable-manpages --disable-podpages \
--disable-txtpages --disable-static --enable-shared

~$ make -j$(nproc) && sudo make install


~$ LD_LIBRARY_PATH=/usr/local/lib ffmpeg -hwaccels
ffmpeg version N-110065-g30cea1d39b Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 10 (Debian 10.2.1-6)
configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --disable-static --enable-shared
libavutil 58. 5.100 / 58. 5.100
libavcodec 60. 6.101 / 60. 6.101
libavformat 60. 4.100 / 60. 4.100
libavdevice 60. 2.100 / 60. 2.100
libavfilter 9. 4.100 / 9. 4.100
libswscale 7. 2.100 / 7. 2.100
libswresample 4. 11.100 / 4. 11.100
Hardware acceleration methods:
vdpau
cuda
vaapi

  • If your CUDA is installed in /usr/local/cuda, append the following to the configure command:
~$ ./configure ....
--extra-cflags=-I/usr/local/cuda/include \
--extra-ldflags=-L/usr/local/cuda/lib64

Runtime errors

~$ ffmpeg  -hwaccel cuda -hwaccel_output_format cuda -f v4l2  -i /dev/video0  -c:a copy -c:v h264_nvenc -b:v 5M output.mp4 -y -loglevel debug
[h264_nvenc @ 0x55aacc4caf00] Driver does not support the required nvenc API version. Required: 12.0 Found: 11.1
[h264_nvenc @ 0x55aacc4caf00] The minimum required Nvidia driver for nvenc is 520.56.06 or newer
[h264_nvenc @ 0x55aacc4caf00] Nvenc unloaded

  • Reinstall nv-codec-headers from a branch/tag that matches your driver.
~$ dpkg -l | grep "cuda"
ii cuda-keyring 1.0-1 all GPG keyring for the CUDA repository
ii libcuda1:amd64 470.161.03-1 amd64 NVIDIA CUDA Driver Library
ii libcudart11.0:amd64 11.2.152~11.2.2-3+deb11u3 amd64 NVIDIA CUDA Runtime Library
ii nvidia-cuda-dev:amd64 11.2.2-3+deb11u3 amd64 NVIDIA CUDA development files
ii nvidia-cuda-toolkit 11.2.2-3+deb11u3 amd64 NVIDIA CUDA development toolkit


~$ cd nv-codec-headers
~$ git tag
n10.0.26.0
n10.0.26.1
n10.0.26.2
n11.0.10.0
n11.0.10.1
n11.0.10.2
n11.1.5.0
n11.1.5.1
n11.1.5.2
n12.0.16.0
[...]
~$ git checkout n11.1.5.2
~$ sudo make install
  • Then rebuild FFmpeg. With nv-codec-headers that are too old for the FFmpeg sources, the build may fail like this:
In file included from libavutil/hwcontext_cuda.c:27:
libavutil/hwcontext_cuda.c: In function ‘cuda_context_init’:
libavutil/hwcontext_cuda.c:365:28: error: ‘CudaFunctions’ has no member named ‘cuCtxGetCurrent’; did you mean ‘cuCtxPopCurrent’?
365 | ret = CHECK_CU(cu->cuCtxGetCurrent(&hwctx->cuda_ctx));
| ^~~~~~~~~~~~~~~
libavutil/cuda_check.h:65:114: note: in definition of macro ‘FF_CUDA_CHECK_DL’
65 | #define FF_CUDA_CHECK_DL(avclass, cudl, x) ff_cuda_check(avclass, cudl->cuGetErrorName, cudl->cuGetErrorString, (x), #x)
| ^
libavutil/hwcontext_cuda.c:365:15: note: in expansion of macro ‘CHECK_CU’
365 | ret = CHECK_CU(cu->cuCtxGetCurrent(&hwctx->cuda_ctx));
| ^~~~~~~~
make: *** [ffbuild/common.mak:81: libavutil/hwcontext_cuda.o] Error 1
make: *** Waiting for unfinished jobs....
CC libavutil/hwcontext_vaapi.o
In file included from /usr/include/CL/cl.h:20,
from libavutil/hwcontext_opencl.h:25,
from libavutil/hwcontext_opencl.c:30:
/usr/include/CL/cl_version.h:22:9: note: ‘#pragma message: cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 300 (OpenCL 3.0)’
22 | #pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 300 (OpenCL 3.0)")
| ^~~~~~~
STRIP libswscale/x86/output.o

Installing VA-API support

~$ sudo apt-get install libnvcuvid1  libgstreamer-plugins-bad1.0-dev \
meson gstreamer1.0-plugins-bad libva-dev -y
  • To compile FFmpeg on Linux, do the following:
~$ git clone https://git.videolan.org/git/ffmpeg/nv-codec-headers.git
~$ cd nv-codec-headers && sudo make install
~$ git clone https://github.com/elFarto/nvidia-vaapi-driver
~$ cd nvidia-vaapi-driver && meson setup build
~$ sudo meson install -C build
  • Running
export LIBGL_DEBUG=verbose
export LIBVA_DRIVER_NAME=nvidia
~$ vainfo
libva info: VA-API version 1.17.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.17 (libva 2.12.0)
vainfo: Driver version: VA-API NVDEC driver [egl backend]
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileH264Main : VAEntrypointVLD
VAProfileH264High : VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain12 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointVLD
  • Detailed vainfo error output
~$ NVD_LOG=1 vainfo
libva info: VA-API version 1.17.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
214878.067671198 [2010296-2010296] ../src/vabackend.c: 108 init CUDA ERROR 'unknown error' (999)

libva info: Found init function __vaDriverInit_1_0
214878.067694762 [2010296-2010296] ../src/vabackend.c:1872 __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 0x55a1fedadf50 10
214878.067698198 [2010296-2010296] ../src/vabackend.c:1894 __vaDriverInit_1_0 Now have 0 (0 max) instances
214878.067700042 [2010296-2010296] ../src/vabackend.c:1916 __vaDriverInit_1_0 Selecting EGL backend
214878.071761148 [2010296-2010296] ../src/export-buf.c: 150 findGPUIndexFromFd Defaulting to CUDA GPU ID 0. Use NVD_GPU to select a specific CUDA GPU
214878.071770746 [2010296-2010296] ../src/export-buf.c: 163 findGPUIndexFromFd Looking for GPU index: 0
214878.073034516 [2010296-2010296] ../src/export-buf.c: 175 findGPUIndexFromFd Found 3 EGL devices
214878.074061472 [2010296-2010296] ../src/export-buf.c: 229 findGPUIndexFromFd No EGL_CUDA_DEVICE_NV support for EGLDevice 0
214878.074069096 [2010296-2010296] ../src/export-buf.c: 229 findGPUIndexFromFd No EGL_CUDA_DEVICE_NV support for EGLDevice 1
214878.074074135 [2010296-2010296] ../src/export-buf.c: 232 findGPUIndexFromFd No DRM device file for EGLDevice 2
214878.074076840 [2010296-2010296] ../src/export-buf.c: 235 findGPUIndexFromFd No match found, falling back to default device
214878.075083408 [2010296-2010296] ../src/export-buf.c: 289 egl_initExporter Driver supports 16-bit surfaces
214878.075096823 [2010296-2010296] ../src/vabackend.c:1948 __vaDriverInit_1_0 CUDA ERROR 'initialization error' (3)

214878.075100831 [2010296-2010296] ../src/export-buf.c: 65 egl_releaseExporter Releasing exporter, 0 outstanding frames
214878.075109497 [2010296-2010296] ../src/export-buf.c: 82 egl_releaseExporter Done releasing frames
libva error: /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
vaInitialize failed with error code 1 (operation failed),exit
  • The error above appears after resuming from hibernation; reload the nvidia_uvm kernel module:
sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm
  • Create a systemd service to redo the settings and reload the module after hibernation:
~$ cat /etc/pm/sleep.d/after-hibernate.sh
#!/bin/bash
# on bookworm will get vaInitialize failed with error code 1 after hibernate.
rmmod nvidia_uvm
modprobe nvidia_uvm

exit 0

  • systemd service
~$ cat /etc/systemd/system/rfh.service
[Unit]
Description=Run script after hibernate recovery
#After=suspend.target
After=hibernate.target
#After=hybrid-sleep.target
[Service]
ExecStart=/etc/pm/sleep.d/after-hibernate.sh
[Install]
#WantedBy=suspend.target
WantedBy=hibernate.target
#WantedBy=hybrid-sleep.target

~$ systemctl enable rfh

Using GStreamer

export LIBVA_DRIVER_NAME=nvidia
export GST_VAAPI_ALL_DRIVERS=1

~$ gst-inspect-1.0 vaapi
Plugin Details:
Name vaapi
Description VA-API based elements
Filename /lib/x86_64-linux-gnu/gstreamer-1.0/libgstvaapi.so
Version 1.22.0
License LGPL
Source module gstreamer-vaapi
Documentation https://gstreamer.freedesktop.org/documentation/vaapi/
Source release date 2023-01-23
Binary package gstreamer-vaapi
Origin URL https://tracker.debian.org/pkg/gstreamer-vaapi

vaapidecodebin: VA-API Decode Bin
vaapih264dec: VA-API H264 decoder
vaapih265dec: VA-API H265 decoder
vaapimpeg2dec: VA-API MPEG2 decoder
vaapisink: VA-API sink
vaapivc1dec: VA-API VC1 decoder
vaapivp9dec: VA-API VP9 decoder

7 features:
+-- 7 elements

  • As shown above, only the decoders listed are supported, and nvidia-vaapi-driver does not yet support vaapipostproc (VA-API video postprocessing), so vaapisink cannot be used for hardware-decoded playback.

    GST_DEBUG=nvdec*:6,nvenc*:6 gst-inspect-1.0 nvdec
    ~$ gst-inspect-1.0 nvcodec
    Plugin Details:
    Name nvcodec
    Description GStreamer NVCODEC plugin
    Filename /lib/x86_64-linux-gnu/gstreamer-1.0/libgstnvcodec.so
    Version 1.22.0
    License LGPL
    Source module gst-plugins-bad
    Documentation https://gstreamer.freedesktop.org/documentation/nvcodec/
    Source release date 2023-01-23
    Binary package GStreamer Bad Plugins (Debian)
    Origin URL https://tracker.debian.org/pkg/gst-plugins-bad1.0

    cudaconvert: CUDA colorspace converter
    cudaconvertscale: CUDA colorspace converter and scaler
    cudadownload: CUDA downloader
    cudascale: CUDA video scaler
    cudaupload: CUDA uploader
    nvh264dec: NVDEC h264 Video Decoder
    nvh264sldec: NVDEC H.264 Stateless Decoder
    nvh265dec: NVDEC h265 Video Decoder
    nvh265sldec: NVDEC H.265 Stateless Decoder
    nvjpegdec: NVDEC jpeg Video Decoder
    nvmpeg2videodec: NVDEC mpeg2video Video Decoder
    nvmpeg4videodec: NVDEC mpeg4video Video Decoder
    nvmpegvideodec: NVDEC mpegvideo Video Decoder
    nvvp9dec: NVDEC vp9 Video Decoder
    nvvp9sldec: NVDEC VP9 Stateless Decoder
  • Testing hardware-decoded playback of an H.264 file

~$ sudo apt-get install gstreamer1.0-plugins-base-apps

~$ gst-discoverer-1.0 test.x264.AAC5.1.mp4
Done discovering test.x264.AAC5.1.mp4
Missing plugins
(gstreamer|1.0|gst-discoverer-1.0|GStreamer element vaapipostproc|element-vaapipostproc)

Properties:
Duration: 1:39:44.736000000
Seekable: yes
Live: no
container #0: Quicktime
video #1: H.264 (High Profile)
Stream ID: 1a5271d9ce1c168fa86e3c3727d54d189e469c400ed56a5836c778c4ddd01ac6/001
Width: 1920
Height: 1036
Depth: 24
Frame rate: 24000/1001
Pixel aspect ratio: 1/1
Interlaced: false
Bitrate: 2249690
Max bitrate: 31250000
audio #2: MPEG-4 AAC
Stream ID: 1a5271d9ce1c168fa86e3c3727d54d189e469c400ed56a5836c778c4ddd01ac6/002
Language: <unknown>
Channels: 6 (front-left, front-right, front-center, lfe1, rear-left, rear-right)
Sample rate: 48000
Depth: 32
Bitrate: 384000
Max bitrate: 384000

  • Decoding with the NVDEC H.264 Stateless Decoder:
export LIBVA_DRIVER_NAME=nvidia
export GST_VAAPI_ALL_DRIVERS=1
export GST_VAAPI_DRM_DEVICE=/dev/dri/renderD128

~$ gst-launch-1.0 filesrc location=test.x264.AAC5.1.mp4 ! parsebin ! nvh264sldec ! videoconvert ! xvimagesink

~$ nvidia-smi
Sun May 7 22:40:07 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:07:00.0 On | N/A |
| N/A 49C P0 N/A / 30W | 726MiB / 2048MiB | 18% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 11453 G /usr/lib/xorg/Xorg 294MiB |
| 0 N/A N/A 12305 G ...e/michael/firefox/firefox 268MiB |
| 0 N/A N/A 93886 G ...AAAAAAAAA= --shared-files 1MiB |
| 0 N/A N/A 95739 G ...RendererForSitePerProcess 74MiB |
| 0 N/A N/A 119708 C gst-launch-1.0 82MiB |
+-----------------------------------------------------------------------------+

  • With the libav H.264 software decoder, CPU usage is roughly 10% higher than above.
~$ gst-launch-1.0 filesrc location=test4.mp4 ! parsebin ! avdec_h264 ! videoconvert ! xvimagesink
  • Use glimagesink to test the speed; fpsdisplaysink shows the current frame rate.
~$ sudo apt-get install gstreamer1.0-gl
~$ GST_VAAPI_DRM_DEVICE=/dev/dri/renderD128 gst-launch-1.0 filesrc location=test4.mp4 ! parsebin ! nvh264sldec ! videoconvert ! fpsdisplaysink video-sink=glimagesink sync=false

Error log analysis

  • The error below occurred because my machine already had /usr/local/include/va, from an older version that lacks these structs, while /usr/include/va (installed by libva-dev) contains the correct ones. meson finds /usr/local/include/va first, so #include <va/va.h> resolves there and /usr/include/va/va.h is shadowed.
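A shadowed header like this can be found mechanically by walking the include path in search order (find_shadowed is a hypothetical diagnostic helper; the directory list mirrors the one the compiler used above):

```python
import os

def find_shadowed(header, search_dirs):
    """Return every copy of `header` found along the include search path,
    in order; more than one hit means the first copy shadows the rest."""
    return [os.path.join(d, header)
            for d in search_dirs
            if os.path.isfile(os.path.join(d, header))]

# e.g. find_shadowed('va/va.h', ['/usr/local/include', '/usr/include'])
```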
~$ cc -Invidia_drv_video.so.p -I. -I.. -I../nvidia-include -I/usr/local/include -I/usr/include -I/usr/include/libdrm -I/usr/include/gstreamer-1.0 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/x86_64-linux-gnu -fvisibility=hidden -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=gnu11 -g -Wno-missing-field-initializers -Wno-unused-parameter -Werror=format -Werror=init-self -Werror=int-conversion -Werror=missing-declarations -Werror=missing-prototypes -Werror=pointer-arith -Werror=undef -Werror=vla -Wsuggest-attribute=format -Wwrite-strings -fPIC -pthread -MD -MQ nvidia_drv_video.so.p/src_h264.c.o -MF nvidia_drv_video.so.p/src_h264.c.o.d -o nvidia_drv_video.so.p/src_h264.c.o -c ../src/h264.c
In file included from ../src/h264.c:1:
../src/vabackend.h:123:77: error: unknown type name ‘VADRMPRIMESurfaceDescriptor’
123 | bool (*fillExportDescriptor)(struct _NVDriver *drv, NVSurface *surface, VADRMPRIMESurfaceDescriptor *desc);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/h264.c:133:1: warning: ‘retain’ attribute directive ignored [-Wattributes]
133 | const DECLARE_CODEC(h264Codec) = {
| ^~~~~

  • The following error is likewise caused by a stale /usr/local/include/EGL on the local system.
FAILED: nvidia_drv_video.so.p/src_export-buf.c.o
cc -Invidia_drv_video.so.p -I. -I.. -I../nvidia-include -I/usr/local/include -I/usr/include/libdrm -I/usr/include/gstreamer-1.0 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/x86_64-linux-gnu -fvisibility=hidden -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -g -Wno-missing-field-initializers -Wno-unused-parameter -Werror=format -Werror=incompatible-pointer-types -Werror=init-self -Werror=int-conversion -Werror=missing-declarations -Werror=missing-prototypes -Werror=pointer-arith -Werror=undef -Werror=vla -Wsuggest-attribute=format -Wwrite-strings -fPIC -pthread -MD -MQ nvidia_drv_video.so.p/src_export-buf.c.o -MF nvidia_drv_video.so.p/src_export-buf.c.o.d -o nvidia_drv_video.so.p/src_export-buf.c.o -c ../src/export-buf.c
../src/export-buf.c: In function ‘egl_initExporter’:
../src/export-buf.c:242:5: error: unknown type name ‘PFNEGLQUERYDMABUFFORMATSEXTPROC’; did you mean ‘PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC’?
242 | PFNEGLQUERYDMABUFFORMATSEXTPROC eglQueryDmaBufFormatsEXT = (PFNEGLQUERYDMABUFFORMATSEXTPROC) eglGetProcAddress("eglQueryDmaBufFormatsEXT");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC
../src/export-buf.c:242:65: error: ‘PFNEGLQUERYDMABUFFORMATSEXTPROC’ undeclared (first use in this function); did you mean ‘PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC’?
242 | PFNEGLQUERYDMABUFFORMATSEXTPROC eglQueryDmaBufFormatsEXT = (PFNEGLQUERYDMABUFFORMATSEXTPROC) eglGetProcAddress("eglQueryDmaBufFormatsEXT");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC
../src/export-buf.c:242:65: note: each undeclared identifier is reported only once for each function it appears in
../src/export-buf.c:242:98: error: expected ‘,’ or ‘;’ before ‘eglGetProcAddress’
242 | PFNEGLQUERYDMABUFFORMATSEXTPROC eglQueryDmaBufFormatsEXT = (PFNEGLQUERYDMABUFFORMATSEXTPROC) eglGetProcAddress("eglQueryDmaBufFormatsEXT");
| ^~~~~~~~~~~~~~~~~
../src/export-buf.c:265:9: error: called object ‘eglQueryDmaBufFormatsEXT’ is not a function or function pointer
265 | if (eglQueryDmaBufFormatsEXT(drv->eglDisplay, 64, formats, &formatCount)) {

  • Nvidia-drm errors
~$ dmesg
[...]
[38810.269044] [drm:drm_new_set_master [drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to grab modeset ownership
[41522.270711] [drm:drm_new_set_master [drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to grab modeset ownership
[42735.271307] [drm:drm_new_set_master [drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to grab modeset ownership
[44347.269266] [drm:drm_new_set_master [drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to grab modeset ownership

  • The errors above are caused by options nvidia-drm modeset=1. Make sure none of the files below, nor any file under /etc/modprobe.d or /usr/lib/modprobe.d, contains this setting.
~$ grep --include=*.conf -rnw /usr/lib/  -e "nvidia-drm"
/usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf

~$ sudo grep --include=*.conf -rnw /etc/ -e "nvidia-drm"
/etc/nvidia/current/nvidia-modprobe.conf:2:options nvidia-drm modset=1
/etc/nvidia/current/nvidia-modprobe.conf:5:install nvidia-drm modprobe nvidia-modeset ; modprobe -i nvidia-current-drm $CMDLINE_OPTS
/etc/nvidia/current/nvidia-modprobe.conf:13:remove nvidia modprobe -r -i nvidia-drm nvidia-modeset nvidia-peermem nvidia-uvm nvidia
/etc/nvidia/current/nvidia-modprobe.conf:15:remove nvidia-modeset modprobe -r -i nvidia-drm nvidia-modeset
/etc/nvidia/current/nvidia-load.conf:1:nvidia-drm
/etc/nvidia/current/nvidia-drm-outputclass.conf:3:# nvidia-drm.ko kernel module. Please note that this only works on Linux kernels
/etc/nvidia/current/nvidia-drm-outputclass.conf:4:# version 3.9 or higher with CONFIG_DRM enabled, and only if the nvidia-drm.ko
/etc/nvidia/current/nvidia-drm-outputclass.conf:9: MatchDriver "nvidia-drm"
  • polkitd segfault
  • interpreting-segfault-messages: https://stackoverflow.com/questions/2549214/interpreting-segfault-messages/2549363#2549363

~$ dmesg
polkitd[99838]: segfault at 8 ip 0000564f56f95736 sp 00007ffe8b5fa800 error 4 in polkitd[564f56f91000+e000] likely on CPU 1 (core 1, socket 0)
  • This error may come from a library-compatibility problem, or from a reinstall failing to overwrite old files. Run the following steps:
    • sudo apt-get remove -y policykit-1;
    • dpkg --purge policykit-1;
    • manually delete /etc/policykit-1 and double-check that the commands above removed everything;
    • sudo apt-get install policykit-1 -y;


  • Atmel's AVR series is a family of 8- to 32-bit microcontrollers based on a modified Harvard architecture with a reduced instruction set (RISC), developed by Atmel in 1996. AVR was among the first single-chip microcontrollers to use flash memory as its program storage, while contemporary microcontrollers mostly used one-time-programmable ROM, EPROM, or EEPROM. The AVR line has grown into six series: tinyAVR (ATtiny); megaAVR (ATmega); XMEGA (ATxmega); application-specific AVR, adding features such as LCD controllers, USB controllers, and PWM; FPSLIC, an AVR core on an FPGA; and AVR32, a 32-bit series with SIMD, DSP, and audio/video processing features that competed with the ARM architecture.

ATmega32U4(Arduino pro micro)

  • sparkfun/Arduino_Boards

  • atmega-asm

  • MiniCore

  • Arduino-Based (ATmega32U4) Mouse and Keyboard Controller

  • The Lost Art of Structure Packing

  • Wire up ICSP to flash the bootloader.

            (2232HIO)
    FT232H ATmega32U4

    pin13 ADBUS0 <------> SCK pin15
    pin14 ADBUS1 <------> MOSI pin16
    pin15 ADBUS2 <------> MISO pin14
    pin16 ADBUS3 <------> Reset RST
    GND <------> GND
    +3.3V <------> +3.3V

    ~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB0 -b 19200 -p atmega32u4 -U lfuse:r:-:i

    avrdude: AVR device initialized and ready to accept instructions

    Reading | ################################################## | 100% 0.01s

    avrdude: Device signature = 0x1e9587 (probably m32u4)
    avrdude: reading lfuse memory:

    Reading | ################################################## | 100% 0.00s

    avrdude: writing output file "<stdout>"
    :01000000FF00
    :00000001FF

    avrdude: safemode: Fuses OK (E:CB, H:D8, L:FF)

    avrdude done. Thank you.
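The -U lfuse:r:-:i output above is Intel HEX: :01000000FF00 is a one-byte data record holding the lfuse value 0xFF, and :00000001FF is the end-of-file record. A small decoder sketch (parse_ihex_record is a hypothetical helper):

```python
def parse_ihex_record(rec):
    """Decode one Intel HEX record, e.g. ':01000000FF00' -> (type, addr, data).
    The last byte is a checksum: the two's complement of the sum of all
    preceding bytes, so the total of the whole record is 0 modulo 256."""
    assert rec.startswith(':'), 'records start with a colon'
    b = bytes.fromhex(rec[1:])
    count, addr, rtype = b[0], (b[1] << 8) | b[2], b[3]
    assert sum(b) & 0xFF == 0, 'checksum mismatch'
    return rtype, addr, b[4:4 + count]

rtype, addr, data = parse_ihex_record(':01000000FF00')
# rtype 0 (data record), addr 0, data b'\xff' -- the lfuse byte read above
```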
  • Some valid programmer types for FTDI adapters:

2232HIO          = FT2232H based generic programmer
4232h = FT4232H based generic programmer
ft232r = FT232R Synchronous BitBang
ft245r = FT245R Synchronous BitBang
ttl232r = FTDI TTL232R-5V with ICSP adapter
UM232H = FT232H based module from FTDI and Glyn.com.au
  • Add sparkfun/Arduino_Boards so the Arduino IDE supports more board types. After adding the URL, install SparkFun AVR Boards via Tools -> Boards Manager, then flash the SparkFun bootloader.
~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB0 -b 19200 -p m32u4  -U flash:w:.arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1e9587 (probably m32u4)
avrdude: NOTE: "flash" memory has been specified, an erase cycle will be performed
To disable this feature, specify the -D option.
avrdude: erasing chip
avrdude: reading input file ".arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex"
avrdude: input file .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex auto detected as Intel Hex
avrdude: writing flash (32762 bytes):

Writing | ################################################## | 100% 0.00s

avrdude: 32762 bytes of flash written
avrdude: verifying flash memory against .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex:
avrdude: load data flash data from input file .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex:
avrdude: input file .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex auto detected as Intel Hex
avrdude: input file .arduino15/packages/SparkFun/hardware/avr/1.1.13/bootloaders/caterina/Caterina-promicro16.hex contains 32762 bytes
avrdude: reading on-chip flash data:

Reading | ################################################## | 100% 0.00s

avrdude: verifying ...
avrdude: 32762 bytes of flash verified

avrdude: safemode: Fuses OK (E:CB, H:D8, L:FF)

avrdude done. Thank you.

ATmega328p (Arduino Pro mini with CH340)

  • Once the bootloader is flashed, you can develop over USB in the Arduino IDE.
  • Select Tools -> Board -> Arduino Pro or Pro Mini, with the AVRISP mkII programmer.

Key points of UART communication

  • Unlike other communication protocols, UART has no clock signal to reference, so both sides must know each other's baud rate in advance to know how fast data is being sent. The requirements are:

    • The two devices must share a common ground (GND).
    • The baud rates must be identical.
  • At a baud rate of 9600, each bit should last 1/9600 of a second. How many CPU cycles is that? If the CPU ran at 9600 Hz, it would be exactly one cycle per bit.
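That relationship is a plain division; for a realistic AVR clock it gives the cycle budget a software UART has per bit (cycles_per_bit is a hypothetical helper):

```python
def cycles_per_bit(cpu_hz, baud):
    """Number of CPU cycles that fit into one UART bit time (1/baud s)."""
    return cpu_hz / baud

# At 9600 baud, a 9600 Hz CPU gets exactly 1 cycle per bit,
# while a 16 MHz AVR gets about 1667 cycles per bit.
print(cycles_per_bit(9600, 9600))
print(round(cycles_per_bit(16_000_000, 9600)))
```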

ATtiny85(CJMCU)

Bootloader

  • ATtiny85 USB Boot Loader: Details

  • micronucleus is a cross-platform bootloader that supports flashing over USB and fits in under 2 KB.


    FT232H ATTiny85

    pin13 ADBUS0 <------> SCK PB2
    pin14 ADBUS1 <------> MOSI PB0
    pin15 ADBUS2 <------> MISO PB1
    pin16 ADBUS3 <------> Reset PB5
    GND <------> GND
    +5V <------> +5V

    ~ micronucleus/firmware/releases$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB1 -b 19200 -p attiny85 -U flash:w:t85_default.hex -U lfuse:w:0xe2:m -U hfuse:w:0xdd:m -U efuse:w:0xfe:m
  • After plugging it into USB, the following device appears:

~$ lsusb
[...]
Bus 004 Device 007: ID 16d0:0753 MCS Digistump DigiSpark
[...]
  • Copy micronucleus/commandline/49-micronucleus.rules into the system's /etc/udev/rules.d/ directory.

System control and reset

  • This involves the fuse-bit (fuse) mechanism. For the details, see the datasheet, section 20. Memory Programming, which describes how to program each bit of the fuse bytes. You can also use https://www.engbedded.com/fusecalc/ to work out the three fuse bytes: lfuse is the low byte, hfuse the high byte, and efuse the extended byte.
  • A few pitfalls when programming fuses. For example, setting the SPIEN or JTAGEN bits to the unprogrammed state disables the chip's JTAG and SPI interfaces, so it can no longer be reflashed and is effectively locked; recovering such a chip requires high-voltage (12 V) parallel programming. Another pitfall is the boot start address: if the bootloader feature is not enabled, do not set BOOTRST to 0 (programmed), or on power-up the MCU will jump to the boot section instead of starting from flash address 0x0000 and will not run correctly.
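As a sanity check alongside the fuse calculator, a fuse byte can be decoded bit by bit. The sketch below uses the ATtiny85 low-fuse bit names (taken from the datasheet's fuse table; note the AVR convention that a 0 bit means "programmed"):

```python
# ATtiny85 low fuse, bit 7 down to bit 0 (per the datasheet's fuse table).
LFUSE_BITS = ['CKDIV8', 'CKOUT', 'SUT1', 'SUT0',
              'CKSEL3', 'CKSEL2', 'CKSEL1', 'CKSEL0']

def programmed(fuse_byte, names=LFUSE_BITS):
    """List the fuse bits that are programmed (i.e. 0) in `fuse_byte`."""
    return [name for i, name in enumerate(names)
            if not fuse_byte & (1 << (7 - i))]

# lfuse 0xe2 (as written in the micronucleus flashing command above)
# leaves CKDIV8 unprogrammed and selects the internal RC oscillator.
print(programmed(0xE2))
```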

Arduino IDE support (ATTinyCore)

  • ATTinyCore adds ATtiny-series support to recent Arduino IDEs. Install it per https://github.com/SpenceKonde/ATTinyCore/blob/master/Installation.md: in Arduino IDE -> File -> Preferences add http://drazzy.com/package_drazzy.com_index.json, then update and install the ATTinyCore package. Afterwards the target MCU can be selected under Arduino IDE -> Tools -> Board -> ATTinyCore.
  • The micronucleus bundled with ATTinyCore may be too old for the version flashed on the ATtiny85, producing the error below. The bundled version lives in ~/.arduino15/packages/ATTinyCore/tools/micronucleus/2.0a4/; the hardware and firmware configurations ATTinyCore supports are under ~/.arduino15/packages/ATTinyCore/hardware/avr/1.4.1/bootloaders.
Warning: device with unknown new version of Micronucleus detected.
This tool doesn't know how to upload to this new device. Updates may be available.
Device reports version as: 2.4
  • To clear the warning above, update the micronucleus command-line tool:
    ~$ cd micronucleus/commandline && make
    ~$ cp micronucleus ~/.arduino15/packages/ATTinyCore/tools/micronucleus/2.0a4/
  • Board: Tools -> Board -> ATTinyCore -> ATtiny85 (Micronucleus / DigiSpark)

  • Flashing method: Tools -> Burn Bootloader Method: "Upgrade (via USB)"

  • With the right board selected, the remaining defaults are fine. Test setup:
    arduino-attinycore-t85.png

  • The test program is File -> Examples -> Built-in examples -> 01.Basics -> Blink, with only the LED_BUILTIN pin redefined, as follows:

    #define LED_BUILTIN 1

    // the setup function runs once when you press reset or power the board
    void setup() {
      // initialize digital pin LED_BUILTIN as an output.
      pinMode(LED_BUILTIN, OUTPUT);
    }

    // the loop function runs over and over again forever
    void loop() {
      digitalWrite(LED_BUILTIN, HIGH); // turn the LED on (HIGH is the voltage level)
      delay(1000);                     // wait for a second
      digitalWrite(LED_BUILTIN, LOW);  // turn the LED off by making the voltage LOW
      delay(1000);                     // wait for a second
    }
  • The flashing output looks like this:

Plug in device now... (will timeout in 60 seconds)
> Please plug in the device ...
> Press CTRL+C to terminate the program.
> Device is found!
connecting: 16% complete
connecting: 22% complete
connecting: 28% complete
connecting: 33% complete
> Device has firmware version 2.4
> Device signature: 0x1e930b
> Available space for user applications: 6522 bytes
> Suggested sleep time between sending pages: 7ms
> Whole page count: 102 page size: 64
> Erase function sleep duration: 714ms
parsing: 50% complete
> Erasing the memory ...
erasing: 55% complete
erasing: 60% complete
erasing: 65% complete
> Starting to upload ...
writing: 70% complete
writing: 75% complete
writing: 80% complete
> Starting the user app ...
running: 100% complete
>> Micronucleus done. Thank you!

Adding a new programmer (UM232H as an example)

  • Platform specification

    ~$ tree -L 2 ~/.arduino15/packages/
    /home/michael/.arduino15/packages/
    ├── arduino
    │   ├── hardware
    │   └── tools
    ├── atmel-avr-xminis
    │   └── hardware
    ├── ATTinyCore
    │   ├── hardware
    │   └── tools
    ├── esp32
    │   ├── hardware
    │   └── tools
    ├── SparkFun
    │   └── hardware
    └── STM32
    ├── hardware
    └── tools
  • Every package under ~/.arduino15/packages/ follows the structure above: each package (e.g. arduino, ATTinyCore) has boards.txt and programmers.txt files under hardware/<arch>/<version>/.

  • A package's tools/ directory holds the toolchain the package depends on: compilers, uploaders, and other specific tools. Taking avrdude as an example, when the Arduino IDE flashes an AVR board it invokes the package's own avrdude together with its configuration file, e.g.:

.arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/bin/avrdude \
-c .arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/etc/avrdude.conf
  • The following makes the ATtiny85 from the ATTinyCore package flashable from the Arduino IDE through a UM232H programmer, with no bootloader required.
  • First, add the following to ~/.arduino15/packages/ATTinyCore/hardware/avr/1.4.1/programmers.txt:
um232h.name=UM232H as ISP
um232h.communication=serial
um232h.protocol=UM232H
um232h.speed=19200
um232h.program.protocol=UM232H
um232h.program.speed=19200
um232h.program.tool=avrdude
um232h.program.extra_params=-P{serial.port} -b{program.speed}
  • After restarting the Arduino IDE, selecting an ATTinyCore board shows UM232H as ISP in the Programmer menu. If a package does not define its own programmers.txt, the IDE falls back to the one for the target board's architecture at ~/.arduino15/packages/arduino/hardware/<arch>/<version>/programmers.txt

  • For example:

    ~$ .arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/bin/avrdude -C .arduino15/packages/ATTinyCore/hardware/avr/1.4.1/avrdude.conf
  • If .arduino15/packages/ATTinyCore/hardware/avr/1.4.1/avrdude.conf has no UM232H entry, add the following to it:

    # UM232H module from FTDI and Glyn.com.au.
    # See helix.air.net.au for detailed usage information.
    # J1: Connect pin 2 and 3 for USB power.
    # J2: Connect pin 2 and 3 for USB power.
    # J2: Pin 7 is SCK
    # :   Pin 8 is MOSI
    # :   Pin 9 is MISO
    # :   Pin 11 is RST
    # :   Pin 6 is ground
    # Use the -b flag to set the SPI clock rate eg -b 3750000 is the fastest I could get
    # a 16MHz Atmega1280 to program reliably. The 232H is conveniently 5V tolerant.
    programmer
      id         = "UM232H";
      desc       = "FT232H based module from FTDI and Glyn.com.au";
      type       = "avrftdi";
      usbvid     = 0x0403;
      # Note: This PID is reserved for generic 232H devices and
      # should be programmed into the EEPROM
      usbpid     = 0x6014;
      usbdev     = "A";
      usbvendor  = "";
      usbproduct = "";
      usbsn      = "";
      # ISP signals
      sck        = 0;
      mosi       = 1;
      miso       = 2;
      reset      = 3;
    ;

  • If the error below appears when flashing, the avrdude binary in use was built without libftdi/libusb support, which is one reason the UM232H entry fails on some setups:

    avrdude: Error: no libftdi or libusb support. Install libftdi1/libusb-1.0 or libftdi/libusb and run configure/make again.
  • The simplest fix is to replace .arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/bin/avrdude with the system's /usr/bin/avrdude.

  • The Arduino IDE calls the in-package tool (avrdude) because its path is defined as follows:

    ~$ grep "avrdude.path" ~/.arduino15/packages/arduino/hardware/avr/1.8.3/platform.txt
    tools.avrdude.path={runtime.tools.avrdude.path}

    ~$ grep "avrdude.path" ~/.arduino15/packages/ATTinyCore/hardware/avr/1.4.1/platform.txt
    tools.avrdude.path={runtime.tools.avrdude.path}
  • Finally, adding other kinds of programmers, such as the FT2232HL, works the same way. This approach combines the software libraries of the Arduino IDE ecosystem with fast hardware development and testing; flashing can also be driven from a Makefile using avrdude directly.

Adding an FT2232HL

  • Opening /etc/avrdude.conf shows that it already defines entries for the FT2232H and FT4232H:
~$ cat /etc/avrdude.conf
[...]
programmer
  id         = "2232HIO";
  desc       = "FT2232H based generic programmer";
  type       = "avrftdi";
  connection_type = usb;
  usbvid     = 0x0403;
  # Note: This PID is reserved for generic H devices and
  # should be programmed into the EEPROM
  # usbpid   = 0x8A48;
  usbpid     = 0x6010;
  usbdev     = "A";
  usbvendor  = "";
  usbproduct = "";
  usbsn      = "";
  # ISP signals
  reset      = 3;
  sck        = 0;
  mosi       = 1;
  miso       = 2;
  buff       = ~4;
  # LED signals
  errled     = ~ 11;
  rdyled     = ~ 14;
  pgmled     = ~ 13;
  vfyled     = ~ 12;
;

# The FT4232H can be treated as FT2232H, but it has a different USB
# device ID of 0x6011.
programmer parent "avrftdi"
  id     = "4232h";
  desc   = "FT4232H based generic programmer";
  usbpid = 0x6011;
;
[...]
  • As shown above, the system-level avrdude already supports the FT2232H, so only a matching entry in the hardware package's programmers.txt is needed. Note, however, that a hardware package usually ships its own avrdude.conf, which is consulted first. Again taking the ATTinyCore hardware package as an example:
~$ tail -n 10 ~/.arduino15/packages/ATTinyCore/hardware/avr/1.5.2/programmers.txt


ft2232h.name=2232HIO as ISP
ft2232h.communication=serial
ft2232h.protocol=avrftdi
ft2232h.speed=19200
ft2232h.program.protocol=avrftdi
ft2232h.program.speed=19200
ft2232h.program.tool=avrdude
ft2232h.program.extra_params=-P{serial.port} -b{program.speed}

  • Wiring the FT2232H to the ATtiny85:
 FT2232H              ATTiny85

ADBUS0 <------> SCK PB2
ADBUS1 <------> MOSI PB0
ADBUS2 <------> MISO PB1
ADBUS3 <------> Reset PB5
GND <------> GND
+5V <------> +5V (VIN)

Flashing with avrdude

  • The preceding sections flashed a bootloader into the ATtiny85's flash with SELFPRGEN (Self-Programming Enable) and SPIEN (Enable Serial Program and Data Downloading) enabled. The advantage is that programs can then be uploaded over USB (D-: PB3/AD3, D+: PB4/AD2) and the chip integrates simply with the Arduino IDE, with no external programmer needed. The drawback is that the bootloader consumes 2 KB of storage, while the ATtiny85 has only 8 KB of flash.

  • Below, the UM232H is used to program the chip directly, the same way the bootloader itself was flashed, so the full 8 KB is available. A simple blink example:

    ~ blink$ cat main.c
    // main.c
    //
    // A simple blinky program for ATtiny85
    // Connect red LED at pin 2 (PB1)
    //
    // electronut.in

    #include <avr/io.h>
    #include <util/delay.h>

    int main (void)
    {
      // set PB1 to be output
      DDRB = 0b00000010;
      while (1) {
        // flash# 1:
        // set PB1 high
        PORTB = 0b00000010;
        _delay_ms(20);
        // set PB1 low
        PORTB = 0b00000000;
        _delay_ms(20);

        // flash# 2:
        // set PB1 high
        PORTB = 0b00000010;
        _delay_ms(200);
        // set PB1 low
        PORTB = 0b00000000;
        _delay_ms(200);
      }

      return 1;
    }
  • Makefile

    # Makefile for programming the ATtiny85
    # modified from the one generated by CrossPack

    DEVICE     = attiny85
    CLOCK      = 8000000
    PROGRAMMER = -c UM232H
    OBJECTS    = main.o
    # for ATTiny85
    # see http://www.engbedded.com/fusecalc/
    FUSES      = -U lfuse:w:0x62:m -U hfuse:w:0xdf:m -U efuse:w:0xff:m

    # Tune the lines below only if you know what you are doing:
    AVRDUDE = avrdude $(PROGRAMMER) -p $(DEVICE)
    COMPILE = avr-gcc -Wall -Os -DF_CPU=$(CLOCK) -mmcu=$(DEVICE)

    # symbolic targets:
    all: main.hex

    .c.o:
    	$(COMPILE) -c $< -o $@

    .S.o:
    	$(COMPILE) -x assembler-with-cpp -c $< -o $@

    .c.s:
    	$(COMPILE) -S $< -o $@

    flash: all
    	$(AVRDUDE) -U flash:w:main.hex:i

    fuse:
    	$(AVRDUDE) $(FUSES)

    # Xcode uses the Makefile targets "", "clean" and "install"
    install: flash fuse

    # if you use a bootloader, change the command below appropriately:
    load: all
    	bootloadHID main.hex

    clean:
    	rm -f main.hex main.elf $(OBJECTS)

    # file targets:
    main.elf: $(OBJECTS)
    	$(COMPILE) -o main.elf $(OBJECTS)

    main.hex: main.elf
    	rm -f main.hex
    	avr-objcopy -j .text -j .data -O ihex main.elf main.hex
    	avr-size --format=avr --mcu=$(DEVICE) main.elf
    # If you have an EEPROM section, you must also create a hex file for the
    # EEPROM and add it to the "flash" target.

    # Targets for code debugging and analysis:
    disasm: main.elf
    	avr-objdump -d main.elf

    cpp:
    	$(COMPILE) -E main.c
  • Wire it up as described above. Since no bootloader is used here, running make flash in the blink directory writes the program to flash via avrdude.

  • The same upload can also be done with a single command:

    ~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB1 -b 19200 -p attiny85 -U flash:w:main.hex:i

Recovering locked fuses with high-voltage programming


  • Recovering the fuses is somewhat involved. Following the post High Voltage programming/Unbricking for Attiny, you need an Arduino board (or at least a microcontroller with six free I/O pins), six 1k resistors, an NPN transistor, and a 12V supply, as shown:

  • high-voltage-programmer.png

  • Reading the fuses:

    ~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB1 -b 19200 -p attiny85  -U lfuse:r:-:i -v

    avrdude: Version 6.3-20171130
    Copyright (c) 2000-2005 Brian Dean, http://www.bdmicro.com/
    Copyright (c) 2007-2014 Joerg Wunsch

    System wide configuration file is "/etc/avrdude.conf"
    User configuration file is "/home/michael/.avrduderc"
    User configuration file does not exist or is not a regular file, skipping

    Using Port : /dev/ttyUSB1
    Using Programmer : UM232H
    Overriding Baud Rate : 19200
    AVR Part : ATtiny85
    Chip Erase delay : 4500 us
    PAGEL : P00
    BS2 : P00
    RESET disposition : possible i/o
    RETRY pulse : SCK
    serial program mode : yes
    parallel program mode : yes
    Timeout : 200
    StabDelay : 100
    CmdexeDelay : 25
    SyncLoops : 32
    ByteDelay : 0
    PollIndex : 3
    PollValue : 0x53
    Memory Detail :

    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    eeprom 65 6 4 0 no 512 4 0 4000 4500 0xff 0xff
    flash 65 6 32 0 yes 8192 64 128 4500 4500 0xff 0xff
    signature 0 0 0 0 no 3 0 0 0 0 0x00 0x00
    lock 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    lfuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    hfuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    efuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    calibration 0 0 0 0 no 1 0 0 0 0 0x00 0x00

    Programmer Type : avrftdi
    Description : FT232H based module from FTDI and Glyn.com.au

    avrdude: AVR device initialized and ready to accept instructions

    Reading | ################################################## | 100% 0.01s

    avrdude: Device signature = 0x1e930b (probably t85)
    avrdude: safemode: lfuse reads as 62
    avrdude: safemode: hfuse reads as DF
    avrdude: safemode: efuse reads as FE
    avrdude: reading lfuse memory:

    Reading | ################################################## | 100% 0.00s

    avrdude: writing output file "<stdout>"
    :01000000629D
    :00000001FF

    avrdude: safemode: lfuse reads as 62
    avrdude: safemode: hfuse reads as DF
    avrdude: safemode: efuse reads as FE
    avrdude: safemode: Fuses OK (E:FE, H:DF, L:62)

    avrdude done. Thank you.

  • An out-of-memory case: defining a 1024-element array overflows the ATtiny85's 512 bytes of SRAM.

    avr-size --format=avr --mcu=attiny85 main.elf
    AVR Memory Usage
    ----------------
    Device: attiny85

    Program: 2454 bytes (30.0% Full)
    (.text + .data + .bootloader)

    Data: 1043 bytes (203.7% Full)
    (.data + .bss + .noinit)

ATtiny85/ATmega328P timer/counter notes

8-bit timer (timer0)

  • First confirm whether the ATtiny85 at hand is actually running at 8 MHz, which can be determined by reading its fuses. The settings used here are -U lfuse:w:0xe2:m -U hfuse:w:0xdf:m -U efuse:w:0xff:m; the fuse configuration used for testing is shown below:

attiny85_fuse.png

  • The ATtiny85's default clock frequency is 8 MHz, meaning 8,000,000 cycles per second; one clock period is 1/8,000,000 s = 0.000000125 s, i.e. 125 ns. A 16-bit timer (0-65535), incremented once per clock cycle, overflows from 0 past 65535 in only 8.192 ms, and an 8-bit timer overflows in just 0.032 ms. For longer intervals the clock must go through the prescaler. According to section 14.9.2 TCCR0B - Timer/Counter Control Register B of the datasheet, bits 2:0 of TCCR0B select the following prescaler settings:
CS02 CS01 CS00  Description
0    0    0     No clock source (Timer/Counter stopped)
0    0    1     clkI/O (no prescaling)
0    1    0     clkI/O / 8 (from prescaler)
0    1    1     clkI/O / 64 (from prescaler)
1    0    0     clkI/O / 256 (from prescaler)
1    0    1     clkI/O / 1024 (from prescaler)
1    1    0     External clock source on T0 pin. Clock on falling edge.
1    1    1     External clock source on T0 pin. Clock on rising edge.
  • The program below uses the 8-bit timer for a one-second delay, toggling the pin state every second. At the default 8 MHz, each cycle is 1/8 MHz = 0.125 us = 125 ns; after a /1024 prescale the timer ticks at 7812.5 Hz, i.e. once every 1/7812.5 = 0.000128 s, so the 8-bit timer can only count up to 0.000128 s * 255 ≈ 0.0326 s.

  • Timer/Counter0 is put in CTC mode with the output-compare-match interrupt enabled, comparing against the value held in OCR0A. With OCR0A = 250, a compare interrupt fires whenever TCNT0 reaches 250 (250 < 255). After 32 interrupts, roughly one second, the state of PB1 is toggled.

    ~$ cat timer0.c
    #include <avr/io.h>
    #include <avr/interrupt.h>

    int intr_count = 0;

    void setupTimer0() {
      cli();
      // Clear registers
      TCCR0A = 0;
      TCCR0B = 0;

      // 7812.5 Hz (8000000/((0+1)*1024))
      OCR0A = 250; // 0.000128s * 250 = 32ms
      // CTC: clear timer on compare match
      TCCR0A |= (1 << WGM01);
      // Prescaler 1024
      TCCR0B |= (1 << CS02) | (1 << CS00);
      // Output Compare Match A Interrupt Enable
      TIMSK |= (1 << OCIE0A);
      sei(); // enable global interrupts, i.e. SREG |= 0x80
    }

    ISR(TIMER0_COMPA_vect) {
      if (intr_count == 31) {
        intr_count = 0;
        PORTB ^= (1 << PB1); // toggle the LED
      } else {
        intr_count++;
      }
    }

    int main()
    {
      DDRB = 0b00000010; // enable PB1
      setupTimer0();
      while (1) {}
    }
  • The ATtiny85's timer1 is also an 8-bit timer, but it supports prescaling up to 14 bits (/16384, MAX = 16384). In the test below, the compare register is set to 248 and the pin is toggled every second interrupt; logic-analyzer sampling shows a 1.006 s square wave. 1/488.28125 = 0.002048 s, which also equals 0.000000125 s * 16384 = 0.002048 s.

~$ cat timer1.c
#include <avr/io.h>
#include <avr/interrupt.h>

int intr_count = 0;

void setupTimer1() {
  cli();
  // Clear registers
  TCNT1 = 0;
  TCCR1 = 0;

  // 488.28125 Hz (8000000/((0+1)*16384))
  OCR1C = 248;
  // interrupt on COMPA
  OCR1A = OCR1C;
  // CTC
  TCCR1 |= (1 << CTC1);
  // Prescaler 16384
  TCCR1 |= (1 << CS13) | (1 << CS12) | (1 << CS11) | (1 << CS10);
  // Output Compare Match A Interrupt Enable
  TIMSK |= (1 << OCIE1A);
  sei();
}

ISR(TIMER1_COMPA_vect) {
  if (intr_count == 1) {
    intr_count = 0;
    PORTB ^= (1 << PB1); // toggle the LED
  } else {
    intr_count++;
  }
}

int main()
{
  DDRB = 0b00000010; // enable PB1
  setupTimer1();
  while (1) {}
}

16-bit timer (timer1)

  • The program below uses the ATmega328P's 16-bit timer1 for a one-second delay. The ATmega328P defaults to 16 MHz, so each cycle is 1/16 MHz = 0.0625 us; after a /1024 prescale the timer ticks at 15625 Hz. The setup mirrors the ATtiny85 case above, except that a 16-bit timer (0-65535) is used. With the compare register set to 15640, logic-analyzer sampling shows a 1 s square wave, while 15625 gives 0.9993 s.
~$ cat timer1.ino
// AVR Timer CTC Interrupts Calculator
// v. 8
// http://www.arduinoslovakia.eu/application/timer-calculator
// Microcontroller: ATmega328P
// Created: 2022-04-27T14:02:45.452Z

#define ledPin 13

void setupTimer1() {
  noInterrupts();
  // Clear registers
  TCCR1A = 0;
  TCCR1B = 0;
  TCNT1 = 0;

  // 15625 Hz (16000000/((0+1)*1024))
  OCR1A = 15640;
  // CTC
  TCCR1B |= (1 << WGM12);
  // Prescaler 1024
  TCCR1B |= (1 << CS12) | (1 << CS10);
  // Output Compare Match A Interrupt Enable
  TIMSK1 |= (1 << OCIE1A);
  interrupts();
}

void setup() {
  pinMode(ledPin, OUTPUT);
  setupTimer1();
}

void loop() {
}

ISR(TIMER1_COMPA_vect) {
  digitalWrite(ledPin, digitalRead(ledPin) ^ 1);
}

ATmega8-16PU


Adding Arduino IDE support

Wiring the ATmega8-16PU to the UM232H

 UM232H             ATmega8-16PU        Arduino pinout
 AD0 (CK) <-------> Pin19 PB5 (SCK)     digital pin 13
 AD1 (DO) <-------> Pin17 PB3 (MOSI)    digital pin 11
 AD2 (DI) <-------> Pin18 PB4 (MISO)    digital pin 12
 AD3 (CS) <-------> Pin1  PC6 (RESET)
 GND      <-------> Pin8  GND
 +5v      <-------> Pin7  VCC

Reading and writing fuses

  • Reading its fuse settings:

    ~$ avrdude -c UM232H -P /dev/ttyUSB1 -b 19200 -p m8  -U lfuse:r:-:i -v

    avrdude: Version 6.3-20171130
    Copyright (c) 2000-2005 Brian Dean, http://www.bdmicro.com/
    Copyright (c) 2007-2014 Joerg Wunsch

    System wide configuration file is "/etc/avrdude.conf"
    User configuration file is "/home/michael/.avrduderc"
    User configuration file does not exist or is not a regular file, skipping

    Using Port : /dev/ttyUSB1
    Using Programmer : UM232H
    Overriding Baud Rate : 19200
    AVR Part : ATmega8
    Chip Erase delay : 10000 us
    PAGEL : PD7
    BS2 : PC2
    RESET disposition : dedicated
    RETRY pulse : SCK
    serial program mode : yes
    parallel program mode : yes
    Timeout : 200
    StabDelay : 100
    CmdexeDelay : 25
    SyncLoops : 32
    ByteDelay : 0
    PollIndex : 3
    PollValue : 0x53
    Memory Detail :

    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    eeprom 4 20 128 0 no 512 4 0 9000 9000 0xff 0xff
    flash 33 10 64 0 yes 8192 64 128 4500 4500 0xff 0x00
    lfuse 0 0 0 0 no 1 0 0 2000 2000 0x00 0x00
    hfuse 0 0 0 0 no 1 0 0 2000 2000 0x00 0x00
    lock 0 0 0 0 no 1 0 0 2000 2000 0x00 0x00
    calibration 0 0 0 0 no 4 0 0 0 0 0x00 0x00
    signature 0 0 0 0 no 3 0 0 0 0 0x00 0x00

    Programmer Type : avrftdi
    Description : FT232H based module from FTDI and Glyn.com.au

    avrdude: AVR device initialized and ready to accept instructions

    Reading | ################################################## | 100% 0.01s

    avrdude: Device signature = 0x1e9307 (probably m8)
    avrdude: safemode: lfuse reads as 62
    avrdude: safemode: hfuse reads as DF
    avrdude: reading lfuse memory:

    Reading | ################################################## | 100% 0.00s

    avrdude: writing output file "<stdout>"
    :01000000629D
    :00000001FF

    avrdude: safemode: lfuse reads as 62
    avrdude: safemode: hfuse reads as DF
    avrdude: safemode: Fuses OK (E:FF, H:DF, L:62)

    avrdude done. Thank you.
  • Writing the fuse configuration, calculated here with the AVR® Fuse Calculator:

atmega8_16pu_fuse.png

~$avrdude -c UM232H -P /dev/ttyUSB1 -b 19200 -p m8   -U lfuse:w:0xd4:m -U hfuse:w:0xc9:m

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1e9307 (probably m8)
avrdude: reading input file "0xd4"
avrdude: writing lfuse (1 bytes):

Writing | ################################################## | 100% 0.01s

avrdude: 1 bytes of lfuse written
avrdude: verifying lfuse memory against 0xd4:
avrdude: load data lfuse data from input file 0xd4:
avrdude: input file 0xd4 contains 1 bytes
avrdude: reading on-chip lfuse data:

Reading | ################################################## | 100% 0.00s

avrdude: verifying ...
avrdude: 1 bytes of lfuse verified
avrdude: reading input file "0xc9"
avrdude: writing hfuse (1 bytes):

Writing | ################################################## | 100% 0.00s

avrdude: 1 bytes of hfuse written
avrdude: verifying hfuse memory against 0xc9:
avrdude: load data hfuse data from input file 0xc9:
avrdude: input file 0xc9 contains 1 bytes
avrdude: reading on-chip hfuse data:

Reading | ################################################## | 100% 0.00s

avrdude: verifying ...
avrdude: 1 bytes of hfuse verified

avrdude: safemode: Fuses OK (E:FF, H:C9, L:D4)

avrdude done. Thank you.

Development and testing with Rust

Prerequisites

~$ sudo apt install binutils-avr avr-libc gcc-avr pkg-config avrdude libudev-dev

Install Micronucleus (Optional)

  • FT2232H wiring to the LilyTiny ATtiny85:
 FT2232H              ATTiny85

ADBUS0 <------> SCK PB2
ADBUS1 <------> MOSI PB0
ADBUS2 <------> MISO PB1
ADBUS3 <------> Reset PB5
GND <------> GND
+5V <------> VIN
  • Build and flash the firmware:
~$ git clone https://github.com/micronucleus/micronucleus
~$ cd micronucleus
~$ make PROGRAMMER="-c UM232H -P /dev/ttyUSB0 -b 19200" flash
  • Build the micronucleus command-line flashing tool:
~$ sudo apt-get install libusb-dev
~$ cd micronucleus/commandline
~$ make && cp micronucleus /usr/bin
  • Connect the board over USB and inspect it:
~$ lsusb -v -s 001:030

Bus 001 Device 030: ID 16d0:0753 MCS Digistump DigiSpark
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 255 Vendor Specific Class
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 8
idVendor 0x16d0 MCS
idProduct 0x0753 Digistump DigiSpark
bcdDevice 2.06
iManufacturer 0
iProduct 0
iSerial 0
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 0x0012
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0x80
(Bus Powered)
MaxPower 100mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 0
bInterfaceClass 0
bInterfaceSubClass 0
bInterfaceProtocol 0
iInterface 0
Device Status: 0xffff
Self Powered
Remote Wakeup Enabled
Test Mode
Debug Mode

Install Rust env

~$ cargo +stable install ravedude
~$ git clone https://github.com/Rahix/avr-hal
~$ cd avr-hal/example/trinket
~$ cargo build
~$ avr-objcopy --output-target=ihex ../../target/avr-attiny85/debug/trinket-simple-pwm.elf ../../target/avr-attiny85/debug/trinket-simple-pwm.hex
~$ micronucleus --timeout 60 --run --no-ansi ../../target/avr-attiny85/debug/trinket-simple-pwm.hex

Build release: minimizing Rust binary size

  • Add the following lines to Cargo.toml:
[profile.release]
strip = true
opt-level = "z" # Optimize for size.
lto = true
panic = "abort"
debug = false

  • Build the release:
~$ RUSTFLAGS="-Zlocation-detail=none" cargo build --release
warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
package: /fullpath//github/AVR/avr-hal/examples/trinket/Cargo.toml
workspace: /fullpath//github/AVR/avr-hal/Cargo.toml
Compiling compiler_builtins v0.1.98
Compiling core v0.0.0 (/home/michael/.rustup/toolchains/nightly-2023-08-08-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core)
Compiling proc-macro2 v1.0.69
Compiling unicode-ident v1.0.12
Compiling syn v1.0.109
Compiling proc-macro-hack v0.5.20+deprecated
Compiling rustversion v1.0.14
Compiling paste v1.0.14
Compiling quote v1.0.33
Compiling avr-hal-generic v0.1.0 (/fullpath//github/AVR/avr-hal/avr-hal-generic)
Compiling avr-device-macros v0.5.2
Compiling ufmt-macros v0.1.1 (https://github.com/Rahix/ufmt.git?rev=12225dc1678e42fecb0e8635bf80f501e24817d9#12225dc1)
Compiling rustc-std-workspace-core v1.99.0 (/home/michael/.rustup/toolchains/nightly-2023-08-08-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/rustc-std-workspace-core)
Compiling nb v1.1.0
Compiling ufmt-write v0.1.0 (https://github.com/Rahix/ufmt.git?rev=12225dc1678e42fecb0e8635bf80f501e24817d9#12225dc1)
Compiling bare-metal v1.0.0
Compiling vcell v0.1.3
Compiling cfg-if v1.0.0
Compiling void v1.0.2
Compiling cfg-if v0.1.10
Compiling embedded-storage v0.2.0
Compiling panic-halt v0.2.0
Compiling ufmt v0.1.0 (https://github.com/Rahix/ufmt.git?rev=12225dc1678e42fecb0e8635bf80f501e24817d9#12225dc1)
Compiling avr-device v0.5.2
Compiling nb v0.1.3
Compiling embedded-hal v0.2.7
Compiling attiny-hal v0.1.0 (/fullpath//github/AVR/avr-hal/mcu/attiny-hal)
Compiling arduino-hal v0.1.0 (/fullpath//github/AVR/avr-hal/arduino-hal)
Compiling trinket-examples v0.0.0 (/fullpath//github/AVR/avr-hal/examples/trinket)
WARN rustc_codegen_ssa::back::link Linker does not support -no-pie command line option. Retrying without.
WARN rustc_codegen_ssa::back::link Linker does not support -no-pie command line option. Retrying without.
Finished release [optimized + debuginfo] target(s) in 6.60s

Driving the NRF24L01

  • NRF24L01 register dump read back from the module:

CONFIG:    0b
EN_AA: 3e
EN_RXADDR: 01
SETUP_AW: 02
SETUP_RETR:30
RF_CH: 02
RF_SETUP: 05
STATUS: 0e
OBS_TX: 00
TX_ADDR: 71917d6b
CD: 00
RX_PW_P0: 20
RX_PW_P1: 00
RX_PW_P2: 00
RX_PW_P3: 00
RX_PW_P4: 00
RX_PW_P5: 00
FIFO_STAT: 11
DYNPD: 00
FEATURE: 05

Driving the SSD1306

Debugging

GDB

Logic analyzer issues

Character handling

AVR-GCC assembly notes

GCC asm Statement

  • Let’s start with a simple example of reading a value from port D:

    asm("in %0, %1" : "=r" (value) : "I" (_SFR_IO_ADDR(PORTD)) );

    Each asm statement is divided by colons into (up to) four parts:

    1. The assembler instructions, defined as a single string constant: "in %0, %1"
    2. A list of output operands, separated by commas. Our example uses just one: "=r" (value)
    3. A comma separated list of input operands. Again our example uses one operand only:
      "I" (_SFR_IO_ADDR(PORTD))
    4. Clobbered registers, left empty in our example.
  • You can write assembler instructions in much the same way as you would write assembler programs. However, registers and constants are used in a different way if they refer to expressions of your C program. The connection between registers and C operands is specified in the second and third part of the asm instruction, the list of input and output operands, respectively. The general form is

asm(code : output operand list : input operand list [: clobber list]);
  • In the code section, operands are referenced by a percent sign followed by a single digit. %0 refers to the first operand, %1 to the second, and so forth. From the above example:
%0 refers to "=r" (value) and
%1 refers to "I" (_SFR_IO_ADDR(PORTD)).

Input and Output Operands

  • Each input and output operand is described by a constraint string followed by a C expression in parentheses. AVR-GCC 3.3 knows the following constraint characters:
  • Note
    • The most up-to-date and detailed information on constraints for the AVR can be found in the gcc manual.
    • The x register is r27:r26, the y register is r29:r28, and the z register is r31:r30
Constraint Used for Range
a Simple upper registers r16 to r23
b Base pointer registers pairs y, z
d Upper register r16 to r31
e Pointer register pairs x, y, z
q Stack pointer register SPH:SPL
r Any register r0 to r31
t Temporary register r0
w Special upper register pairs r24, r26, r28, r30
x Pointer register pair X x (r27:r26)
y Pointer register pair Y y (r29:r28)
z Pointer register pair Z z (r31:r30)
G Floating point constant 0.0
I 6-bit positive integer constant 0 to 63
J 6-bit negative integer constant -63 to 0
K Integer constant 2
L Integer constant 0
l Lower registers r0 to r15
M 8-bit integer constant 0 to 255
N Integer constant -1
O Integer constant 8, 16, 24
P Integer constant 1
Q (GCC >= 4.2.x) A memory address based on Y or Z pointer with displacement.
R (GCC >= 4.3.x) Integer constant. -6 to 5
Mnemonic Constraints Mnemonic Constraints
adc r,r add r,r
adiw w,I and r,r
andi d,M asr r
bclr I bld r,I
brbc I,label brbs I,label
bset I bst r,I
cbi I,I cbr d,I
com r cp r,r
cpc r,r cpi d,M
cpse r,r dec r
elpm t,z eor r,r
in r,I inc r
ld r,e ldd r,b
ldi d,M lds r,label
lpm t,z lsl r
lsr r mov r,r
movw r,r mul r,r
neg r or r,r
ori d,M out I,r
pop r push r
rol r ror r
sbc r,r sbci d,M
sbi I,I sbic I,I
sbiw w,I sbr d,M
sbrc r,I sbrs r,I
ser d st e,r
std b,r sts label,r
sub r,r subi d,M
swap r
  • Constraint characters may be prepended by a single constraint modifier. Constraints without a modifier specify read-only operands. Modifiers are:
  • Modifier Specifies
= 	Write-only operand, usually used for all output operands.
+ Read-write operand
& Register should be used for output only

Note: in assembler programming, the term clobbered registers is used to denote any registers whose value may be overwritten during the course of executing an instruction or procedure.


Docker Machine

Overview

  • Docker Machine is one of Docker's official orchestration projects, responsible for quickly installing a Docker environment on a variety of platforms. It is implemented in Go and currently maintained on GitHub. With Docker Machine you can install and configure docker hosts in bulk; a host can be a local virtual machine, a physical machine, or a cloud instance. Supported environments include:
    • (1) Regular Linux operating systems.
    • (2) Virtualization platforms: VirtualBox, VMWare, Hyper-V, OpenStack.
    • (3) Public clouds: Amazon Web Services, Microsoft Azure, Google Compute Engine, Digital Ocean, etc.
  • Docker Machine gives these environments a unified name: provider. For a given provider, Docker Machine uses the corresponding driver to install and configure the docker host.

Installing via script

# Install docker-machine
~$ base=https://github.com/docker/machine/releases/download/v0.16.0 &&
curl -L $base/docker-machine-$(uname -s)-$(uname -m) >/tmp/docker-machine &&
sudo install /tmp/docker-machine /usr/local/bin/docker-machine
# Check the version
~$ docker-machine version
docker-machine version 0.16.0, build 702c267f
# List machines
~$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS

Creating a machine

  • The command below creates a VirtualBox VM named default on the local machine:
    $ docker-machine  create -d virtualbox default
    Running pre-create checks...
    Creating machine...
    (default) Copying /home/lcy/.docker/machine/cache/boot2docker.iso to /home/lcy/.docker/machine/machines/default/boot2docker.iso...
    (default) Creating VirtualBox VM...
    (default) Creating SSH key...
    (default) Starting the VM...
    (default) Check network to re-create if needed...
    (default) Waiting for an IP...
    Waiting for machine to be running, this may take a few minutes...
    Detecting operating system of created instance...
    [...]
    Docker is up and running!
    To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env default
    ~$ docker-machine create -d virtualbox manager1 &&
    > docker-machine create -d virtualbox manager2 &&
    > docker-machine create -d virtualbox worker1 &&
    > docker-machine create -d virtualbox worker2
    ~$ docker-machine ls
    NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
    default - virtualbox Running tcp://192.168.99.100:2376 v18.09.1
    manager1 - virtualbox Running tcp://192.168.99.101:2376 v18.09.1
    manager2 - virtualbox Running tcp://192.168.99.102:2376 v18.09.1
    worker1 - virtualbox Running tcp://192.168.99.103:2376 v18.09.1
    worker2 - virtualbox Running tcp://192.168.99.104:2376 v18.09.1

Docker-side operations

# Upgrade docker on worker1 and worker2 to the latest version
~$ docker-machine upgrade worker1 worker2
  • To support management via Ansible, install some Python-related packages (see the reference link). tce's install directory is docker@manager1:/usr/local/tce.installed; for example, first remove an old Python install with docker-machine ssh manager1 "rm -rf /usr/local/tce.installed/python", then reinstall with the command below. The bash for loop used here runs serially; for parallel execution, see parallel.

Installing the Python environment packages

~$  for item in manager1 manager2 worker1 worker2; do docker-machine ssh $item "tce-load -wi python && curl https://bootstrap.pypa.io/get-pip.py | sudo python - && sudo ln -s /usr/local/bin/python /usr/bin/python";done

Ansible management connection

  • Create the inventory file. Because each host's private key lives in a different location, specify them explicitly in hosts.txt as follows.

    ~$ cat hosts.txt
    [swarm]
    192.168.99.100 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/default/id_rsa ansible_python_interpreter=/usr/local/bin/python
    192.168.99.101 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/manager1/id_rsa ansible_python_interpreter=/usr/local/bin/python
    192.168.99.102 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/manager2/id_rsa ansible_python_interpreter=/usr/local/bin/python
    192.168.99.103 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/worker1/id_rsa ansible_python_interpreter=/usr/local/bin/python
    192.168.99.104 ansible_ssh_private_key_file=/home/lcy/.docker/machine/machines/worker2/id_rsa ansible_python_interpreter=/usr/local/bin/python
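Since the key paths follow the machine names, the inventory lines above can also be generated from `docker-machine ls` output. A sketch with awk, where the here-doc sample stands in for the real command output (field positions assumed to match the layout shown earlier):

```shell
# Sketch: derive Ansible inventory lines from `docker-machine ls`-style output.
# The here-doc sample stands in for real `docker-machine ls` output.
inventory=$(awk 'NR>1 {
    ip=$5; sub(/^tcp:\/\//, "", ip); sub(/:[0-9]+$/, "", ip);
    printf "%s ansible_ssh_private_key_file=%s/.docker/machine/machines/%s/id_rsa ansible_python_interpreter=/usr/local/bin/python\n", ip, ENVIRON["HOME"], $1
}' <<'EOF'
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
manager1 - virtualbox Running tcp://192.168.99.101:2376 v18.09.1
worker1 - virtualbox Running tcp://192.168.99.103:2376 v18.09.1
EOF
)
echo "$inventory"
```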
  • Connection test.

    ~$ ansible -i hosts.txt all -u docker -m ping
    192.168.99.102 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    192.168.99.100 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    192.168.99.103 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    192.168.99.104 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    192.168.99.101 | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }

Configuring the Docker Swarm cluster nodes

Configure the Swarm manager node

~$ docker-machine ssh manager1 "docker swarm init --advertise-addr 192.168.99.101"
Swarm initialized: current node (slf4m19dsk0cvo6wcpxjwm10v) is now a manager.

To add a worker to this swarm, run the following command:

docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-dsf3agxhdq6prycves2cxg16w 192.168.99.101:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

Configure the worker nodes

  • Join worker1 and worker2 to the cluster as worker nodes.
    ~$ docker-machine ssh worker1 "docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-dsf3agxhdq6prycves2cxg16w 192.168.99.101:2377"
    This node joined a swarm as a worker.

    ~$ docker-machine ssh worker2 "docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-dsf3agxhdq6prycves2cxg16w 192.168.99.101:2377"
    This node joined a swarm as a worker.

Add a standby manager node

# Show the token for joining the swarm as a manager
~$ docker-machine ssh manager1 "docker swarm join-token manager"
To add a manager to this swarm, run the following command:

docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-1pg2xuizss0074mw3xgf59b5t 192.168.99.101:2377
# Join manager2 as a manager node, acting as a standby candidate manager
~$ docker-machine ssh manager2 "docker swarm join --token SWMTKN-1-4lwpkgvw8lqn68k20n89qy39uvhdx4u6cznak5zy3q6sp5nlp3-1pg2xuizss0074mw3xgf59b5t 192.168.99.101:2377"
This node joined a swarm as a manager.
# List the swarm cluster nodes.
~$ docker-machine ssh manager1 "docker node ls"
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
slf4m19dsk0cvo6wcpxjwm10v * manager1 Ready Active Leader 18.09.1
5navxdx9kkvph4elfb2dcuiee manager2 Ready Active Reachable 18.09.1
wsvs8cb6sbroj0wuz09z9vrdj worker1 Ready Active 18.09.1
qkwz8akr2ef8oe5kddibmglfs worker2 Ready Active 18.09.1

Create a private Docker registry

  • Create a private image registry on the local machine with the following command.

    ~$ docker run -d -v /data/docker-registry:/var/lib/registry -p 5000:5000 --restart=always --name registry registry
    Unable to find image 'registry:latest' locally
    latest: Pulling from library/registry
    cd784148e348: Pull complete
    [...]
    Digest: sha256:a54bc9be148764891c44676ce8c44f1e53514c43b1bfbab87b896f4b9f0b5d99
    Status: Downloaded newer image for registry:latest
    242af2d15586d2d571c46c5edf821ce958cf22139d957e52a6f5d959726957bf
  • Next, push an existing local image to the newly created registry.

    ~$ docker ps
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    242af2d15586 registry "/entrypoint.sh /etc…" About a minute ago Up About a minute 0.0.0.0:5000->5000/tcp registry
    8ee3d7fb435f redis "docker-entrypoint.s…" 4 months ago Up 3 hours 0.0.0.0:6379->6379/tcp redis
    608b60a022e8 postgres:9.6 "docker-entrypoint.s…" 4 months ago Up 3 hours 0.0.0.0:5432->5432/tcp pg96

    # Re-tag the local postgres:9.6 image as 192.168.99.1:5000/postgres:v3
    ~$ docker tag postgres:9.6 192.168.99.1:5000/postgres:v3
    ~$ docker images
    REPOSITORY TAG IMAGE ID CREATED SIZE
    registry latest 116995fd6624 5 days ago 25.8MB
    127.0.0.1:5000/postgres 9.6 0178d5af9576 5 months ago 229MB
    192.168.99.1:5000/postgres 9.6 0178d5af9576 5 months ago 229MB
    postgres 9.6 0178d5af9576 5 months ago 229MB
    # Push the image.
    ~$ docker push 192.168.99.1:5000/postgres:v3
    The push refers to repository [192.168.99.1:5000/postgres]
    10cb36af78fe: Pushed
    [...]
    v3: digest: sha256:86a7984760c1d36c7c9ebec73706f05d76e7615937a45ae0d110b2112fd5cbfa size: 3245
    ~$ curl http://192.168.99.1:5000/v2/_catalog
    {"repositories":["postgres"]}
    ~$ curl http://192.168.99.1:5000/v2/postgres/tags/list
    {"name":"postgres","tags":["v3"]}
  • Simply pull an image from the public registry and push it into the private one.

    ~$ docker pull dockersamples/visualizer
    ~$ docker tag dockersamples/visualizer 192.168.99.1:5000/visualizer:v4
    ~$ docker push 192.168.99.1:5000/visualizer:v4
    ~$ curl http://192.168.99.1:5000/v2/_catalog
    {"repositories":["postgres","visualizer"]}

- Use ansible to add the private registry address created above to the Docker daemon configuration on all four hosts.

~$ ansible -i hosts.txt all -u docker -b  -m lineinfile -a "path=/etc/docker/daemon.json line='{\n\t\t\"insecure-registries\":    [\"192.168.99.1:5000\"]\n}' create=yes"
# Restart all nodes
~$ docker-machine restart manager1 manager2 worker2 worker1
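Note that lineinfile with an embedded multi-line string easily produces malformed JSON. A safer pattern is to write the whole file and validate it before shipping; a sketch (using /tmp so it is safe to run anywhere; on the nodes the real target is /etc/docker/daemon.json, e.g. via Ansible's copy module):

```shell
# Sketch: write a complete daemon.json rather than splicing lines into it.
# /tmp is used here so the sketch is harmless; the real target on each
# node is /etc/docker/daemon.json.
conf=/tmp/daemon.json
cat > "$conf" <<'EOF'
{
    "insecure-registries": ["192.168.99.1:5000"]
}
EOF
# Validate the JSON before distributing it to the nodes.
python3 -m json.tool "$conf" > /dev/null && echo "daemon.json is valid JSON"
```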

Deploying a single cluster service with Docker Service

~$ docker-machine ssh manager1 "docker service create --replicas 4 -p 15432:5432 --name pgsql 192.168.99.1:5000/postgres:v3"
yt4u3vmyq3gp9pc2bgowti1lf
overall progress: 0 out of 4 tasks
[....]
verify: Service converged

~$ docker-machine ssh manager1 "docker service ls"
ID NAME MODE REPLICAS IMAGE PORTS
yt4u3vmyq3gp pgsql replicated 4/4 192.168.99.1:5000/postgres:v3 *:15432->5432/tcp
~$ docker-machine ssh manager1 "docker service ps pgsql"
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
zj09wnhy4utz pgsql.1 192.168.99.1:5000/postgres:v3 worker1 Running Running 27 seconds ago
l7feljripf38 pgsql.2 192.168.99.1:5000/postgres:v3 manager2 Running Running 27 seconds ago
omxdx1cw8c7x pgsql.3 192.168.99.1:5000/postgres:v3 worker2 Running Running 27 seconds ago
tkghl3px2l08 pgsql.4 192.168.99.1:5000/postgres:v3 manager1 Running Running 26 seconds ago

# Log in to test
~$ psql -h 192.168.99.101 -p 15432 -U postgres -W
Password for user postgres:
psql (10.6 (Debian 10.6-1.pgdg90+1), server 9.6.11)
Type "help" for help.

Deploying multiple cluster services with Docker Stack

  • Make sure the following images exist in the local registry. Verify the images:

    ~$ docker images | awk '$2 ~/v4/ {print}'
    192.168.99.1:5000/nginx v4 42b4762643dc 34 hours ago 109MB
    192.168.99.1:5000/visualizer v4 f6411ebd974c 3 weeks ago 166MB
    192.168.99.1:5000/portainer v4 a01958db7424 6 weeks ago 72.2MB
  • docker-compose.yml

    version: '3'

    services:
      nginx:
        image: 192.168.99.1:5000/nginx:v4
        ports:
          - 8088:80
        deploy:
          mode: replicated
          replicas: 4

      visualizer:
        image: 192.168.99.1:5000/visualizer:v4
        ports:
          - '8080:8080'
        volumes:
          - '/var/run/docker.sock:/var/run/docker.sock'
        deploy:
          replicas: 1
          placement:
            constraints: [node.role == manager]

      portainer:
        image: 192.168.99.1:5000/portainer:v4
        ports:
          - '9000:9000'
        volumes:
          - '/var/run/docker.sock:/var/run/docker.sock'
        deploy:
          replicas: 1
          placement:
            constraints: [node.role == manager]

Create the services

~$ docker-machine ssh manager1 "docker stack deploy -c docker-compose.yml deploy-demo"
Creating network deploy-demo_default
Creating service deploy-demo_visualizer
Creating service deploy-demo_portainer
Creating service deploy-demo_nginx

Managing Aliyun ECS

The official Aliyun driver

  • Aliyun Docker Machine driver
  • Getting started with the Aliyun ECS Docker Machine Driver
  • By analogy with creating machines in the local VirtualBox above, the official Aliyun driver requires the Aliyun security key pair and the corresponding Region. Creating a VPC network additionally requires the
    VPC ID and VSwitch ID. This is somewhat cumbersome, and the --aliyunecs-access-key-id and --aliyunecs-access-key-secret parameters carry significant privileges, so I will look into it further only when a concrete use case at work calls for it.

The generic driver

  • Although there is no Aliyun driver below, there is a generic driver that manages existing machines over SSH; in principle every Linux machine is supported.
    ~$ docker-machine create --driver generic --generic-ip-address DB001 --generic-ssh-user lcy  --generic-ssh-key $HOME/.ssh/id_rsa   aliyun-machine
    Running pre-create checks...
    Creating machine...
    (aliyun-machine) Importing SSH key...
    Waiting for machine to be running, this may take a few minutes...
    Detecting operating system of created instance...
    Waiting for SSH to be available...
    Detecting the provisioner...
    Provisioning with debian...
    Copying certs to the local machine directory...
    Copying certs to the remote machine...
    Setting Docker configuration on the remote daemon...
    Checking connection to Docker...
    Docker is up and running!
    To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env aliyun-machine
    ~$ docker-machine env aliyun-machine
    export DOCKER_TLS_VERIFY="1"
    export DOCKER_HOST="tcp://DB001:2376"
    export DOCKER_CERT_PATH="/home/lcy/.docker/machine/machines/aliyun-machine"
    export DOCKER_MACHINE_NAME="aliyun-machine"
    # Run this command to configure your shell:
    # eval $(docker-machine env aliyun-machine)

Errors

  • Add the line { "insecure-registries": ["192.168.99.1:5000"] } to /etc/docker/daemon.json, otherwise the following error occurs.

    ~$ docker pull 192.168.99.1:5000/postgres:9.6
    Error response from daemon: Get https://192.168.99.1:5000/v2/: http: server gave HTTP response to HTTPS client
  • Error when Ansible connects to a boot2docker image that has no Python environment

    192.168.99.102 | FAILED! => {
    "changed": false,
    "module_stderr": "Shared connection to 192.168.99.102 closed.\r\n",
    "module_stdout": "/bin/sh: /usr/local/bin/python: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
    }

Ansible

Installation

~$ pip install ansible ansible-lint

Command format

  • ansible <host-pattern> [options]: the hosts or host groups defined in the Inventory; this can be an IP, a hostname, an Inventory group name, or a pattern string with special characters such as ".", "*", ":". <> marks a required argument, [] an optional one.
  • For example, upgrading the system as root with the apt module. The YAML-based configuration files are covered later.
    ~$ ansible -i ~/.ansible/hosts all -u root -m apt -a "upgrade=yes update_cache=yes cache_valid_time=86400"
  • The example below uses the authorized_key module to add a public key for a user; exclusive=True would replace the existing keys. Here the new key is appended to the existing ones.
    ~$ ansible -i ~/.ansible/hosts all -u lcy -m authorized_key -a "user=lcy state=present  key='ssh-rsa AAAAB3NzaC1yc...... user@gentoo'"
# Append entries to the hosts file on each machine
    ~$ ansible -i ~/.ansible/hosts all -u lcy -b -m lineinfile -a "path=/etc/hosts create=yes line='127.0.0.1\tlocalhost\n172.18.127.186\tDB001\n172.18.192.77\tWeb001\n172.18.253.222\tFE001\n172.18.192.76\tDIG001'"

View system information

# Call the command module here to run a system command.
~$ ansible -i ~/.ansible/hosts Web001 -u root -m command -a 'lsb_release -a'
120.77.xxx.xx | CHANGED | rc=0 >>
Distributor ID: Debian
Description: Debian GNU/Linux 9.6 (stretch)
Release: 9.6
Codename: stretchNo LSB modules are available.

Command-line tools

ansible-doc

  • Use ansible-doc to browse documentation for all supported modules. Since I am on Debian, the package-management modules relevant to it are listed here.
    ~$ ansible-doc  -l | grep "apt"
    apt Manages apt-packages
    apt_key Add or remove an apt key
    apt_repository Add and remove APT repositories
    apt_rpm apt_rpm package manager
    na_ontap_ucadapter NetApp ONTAP UC adapter configuration
    nios_naptr_record Configure Infoblox NIOS NAPTR records

    # View the module's parameters and usage examples; the web docs are at [apt_module](https://docs.ansible.com/ansible/latest/modules/apt_module.html)
    ~$ ansible-doc apt
    > APT (/home/lcy/.pyenv/versions/3.6.6/envs/py3dev/lib/python3.6/site-packages/ansible/modules/packaging/os/apt.py)

    Manages \`apt\' packages (such as for Debian/Ubuntu).

    OPTIONS (= is mandatory):

    - allow_unauthenticated
    Ignore if packages cannot be authenticated. This is useful for bootstrapping environments that manage their own apt-key setup.
    \`allow_unauthenticated\' is only supported with state: \`install\'/\`present\'
    [Default: no]
    type: bool
    version_added: 2.1
    [....]

ansible-playbook

  • Ansible task configuration files are called Playbooks. A Playbook is easy to write, highly customizable, and flexible, and can codify all routine operations. A complete real-world example is shown in the Docker installation below.

Configuration

  • The default configuration file is named ansible.cfg and can live in several places, typically /etc/ansible/ansible.cfg or ~/.ansible/ansible.cfg under the home directory. Ad-hoc commands are one-off executions; an ansible-playbook run is a collection of ad-hoc steps, effectively a script.
  • Below is a typical hosts file; the default lookup location is /etc/ansible/hosts, but here it lives under the home directory.
    ~$ cat ~/.ansible/hosts
    [Web001]
    120.77.xxx.xx

    [DB001]
    119.23.xx.xxx

    [DIG001]
    120.78.xx.xxx

    [FE001]
    112.74.xxx.xx

    ~$ ansible -i ~/.ansible/hosts all -m ping -u root
    120.77.xxx.xx | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    119.23.xx.xxx | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    112.74.xx.xxx | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }
    120.78.xxx.xx | SUCCESS => {
    "changed": false,
    "ping": "pong"
    }

ansible-vault

  • ansible-vault is mainly used to encrypt configuration files, e.g. when a Playbook contains sensitive information. See here.
# After encryption, a.yaml is unreadable ciphertext.
~$ ansible-vault encrypt a.yaml
New Vault password:
Confirm New Vault password:
Encryption successful
# Decrypt a.yaml
~$ ansible-vault decrypt a.yaml
Vault password:
Decryption successful

ansible-galaxy

  • Ansible Galaxy 文档

  • This has nothing to do with Samsung phones; think of it as something like GitHub or pip. It is mainly used to generate, search for, and install Roles. Some high-quality Roles can be found here.

    ~$  ansible-galaxy --help
    Usage: ansible-galaxy [delete|import|info|init|install|list|login|remove|search|setup] [--help] [options] ...

    Perform various Role related operations.

    Options:
    -h, --help show this help message and exit
    -c, --ignore-certs Ignore SSL certificate validation errors.
    -s API_SERVER, --server=API_SERVER
    The API server destination
    -v, --verbose verbose mode (-vvv for more, -vvvv to enable
    connection debugging)
    --version show program's version number and exit

    See 'ansible-galaxy <command> --help' for more information on a specific
    command.
  • Install the postgresql Role and inspect its directory structure; use it as a reference when creating custom Roles.

    ~$ ansible-galaxy install geerlingguy.postgresql
    - downloading role 'postgresql', owned by geerlingguy
    - downloading role from https://github.com/geerlingguy/ansible-role-postgresql/archive/1.4.5.tar.gz
    - extracting geerlingguy.postgresql to /home/lcy/.ansible/roles/geerlingguy.postgresql
    - geerlingguy.postgresql (1.4.5) was installed successfully
    ~$ tree /home/lcy/.ansible/roles/geerlingguy.postgresql/
    /home/lcy/.ansible/roles/geerlingguy.postgresql/
    ├── defaults
    │ └── main.yml
    ├── handlers
    │ └── main.yml
    ├── LICENSE
    ├── meta
    │ └── main.yml
    ├── molecule
    │ └── default
    │ ├── molecule.yml
    │ ├── playbook.yml
    │ ├── tests
    │ │ └── test_default.py
    │ └── yaml-lint.yml
    ├── README.md
    ├── tasks
    │ ├── configure.yml
    │ ├── databases.yml
    │ ├── initialize.yml
    │ ├── main.yml
    │ ├── setup-Debian.yml
    │ ├── setup-RedHat.yml
    │ ├── users.yml
    │ └── variables.yml
    ├── templates
    │ ├── pg_hba.conf.j2
    │ └── postgres.sh.j2
    └── vars
    ├── Debian-7.yml
    ├── Debian-8.yml
    ├── Debian-9.yml
    ├── RedHat-6.yml
    ├── RedHat-7.yml
    ├── Ubuntu-14.yml
    ├── Ubuntu-16.yml
    └── Ubuntu-18.yml

    9 directories, 27 files

Installing Docker via Ansible

  • Serve Static Files by Nginx from Django using Docker
    - apt - Manages apt-packages - Get Docker CE for Debian - Following Get Docker CE for Debian, the installation procedure is converted into a Playbook below. With this installation method, Docker on each target host is independent; to orchestrate them together, use Docker Machine.

    ---
    - name: Install base software
      hosts: all
      become: yes
      # user: root  # root works here too, but once root SSH login is disabled you must use sudo.
      tasks:
        # See https://docs.ansible.com/ansible/latest/modules/apt_module.html
        - name: Update and install
          apt:
            name: ['apt-transport-https', 'ca-certificates', 'curl', 'software-properties-common', 'tmux']
            allow_unauthenticated: yes
            update_cache: yes

        # Another way to install a list of packages
        - name: Install a list of packages
          apt:
            name: '{{ packages }}'
            update_cache: yes
          vars:
            packages:
              - git
              - rsync
              - gcc
              - dirmngr
              - bwm-ng
              - tmux
              - tree

        - name: Update and install gnupg2
          apt:
            name: gnupg2
            allow_unauthenticated: no
            update_cache: yes

        # See https://docs.ansible.com/ansible/latest/modules/apt_key_module.html?highlight=apt%20key
        - name: Add the Docker GPG key
          apt_key:
            url: https://download.docker.com/linux/debian/gpg
            state: present

        # See https://docs.ansible.com/ansible/latest/modules/command_module.html#command-module
        - name: Read the distribution codename
          command: lsb_release -sc
          register: result

        # See https://docs.ansible.com/ansible/latest/modules/apt_repository_module.html?highlight=add%20apt%20repository
        - apt_repository:
            repo: deb [arch=amd64] https://download.docker.com/linux/debian {{ result.stdout }} stable
            state: present
            filename: docker-ce

        # Install docker-ce and bridge-utils
        - name: Update and install 'docker-ce', 'bridge-utils'
          apt:
            name: ['docker-ce', 'bridge-utils']
            allow_unauthenticated: yes
            update_cache: yes

        - name: Read the kernel name
          command: uname -s
          register: vendor

        - name: Read the machine architecture
          command: uname -m
          register: arch

        # Install the latest docker-compose; the version packaged in apt is very old.
        - name: Install the latest docker-compose 1.23.2
          get_url:
            url: https://github.com/docker/compose/releases/download/1.23.2/docker-compose-{{ vendor.stdout }}-{{ arch.stdout }}
            dest: /usr/local/bin/docker-compose
            mode: 0755

        - name: Reboot the machine
          reboot:
            reboot_timeout: 3600
  • After the reboot, you should be able to inspect Docker with the command below.

~$ ansible -i ~/.ansible/hosts all -u lcy  -m command -a "docker info"
  • Note: if bridge-utils is not installed and the machine not rebooted, Docker cannot start, and an error like the following may appear.
    Dec 29 10:26:16 DB001 dockerd[20493]: time="2018-12-29T10:26:16.760508487+08:00" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
    Dec 29 10:26:16 DB001 dockerd[20493]: Error starting daemon: Error initializing network controller: Error creating default "bridge" network: package not installed
    Dec 29 10:26:16 DB001 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
    Dec 29 10:26:16 DB001 systemd[1]: Failed to start Docker Application Container Engine.

Adding users and groups

  • To enable sudo for a user there are two approaches: modify /etc/sudoers on the remote host (see here), or use copy-and-replace (see here). You can also simply add the user to the sudo group.

  • To give each user an initial password, see here. On Linux you can use the mkpasswd command; on other systems the Python package passlib is a substitute. For passwordless sudo, add the NOPASSWD keyword.

    ~$ mkpasswd --method=sha-512
    Password:
    $6$5QYVZSmH7$FXQcAQ8FsjMVk0x.ATQgpFHhgImp7hdITMh7zAE.VeAkQYDzdFAOxx6jqVFOY.52nRW4a6SjzEUnK.JSh73W61

    ~$ python -c "from passlib.hash import sha512_crypt; import getpass; print(sha512_crypt.using(rounds=5000).hash(getpass.getpass()))"
    Password:
    $6$J15B6vXoZeekBlVy$.F6PelDYQRCeqapZ2/V3BQ5IjJXCdhG4g5NgoeNvnGJqf1dValk38IDzBuMfmctLMgQ4llyzVT3WN4pYrIpmZ0
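The hash produced above plugs straight into the `user` module's `password` parameter. A minimal fragment (the truncated hash value is illustrative only):

```yaml
- name: Add the user with an initial password
  user:
    name: lcy
    # Must be a pre-hashed string, e.g. from mkpasswd or the passlib one-liner above.
    password: '$6$5QYVZSmH7$FXQcAQ8Fs...'
    state: present
```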
  • Below, two users are added and placed in the relevant groups, and the sshd_config parameters are adjusted.

    ---
    - name: Add users
      hosts: all
      user: root
      tasks:
        - name: Remove users
          user:
            name: '{{ item }}'
            state: absent    # absent removes the user, present adds it
            remove: yes
          with_items:
            - lcy
            - gavin_kou

        - name: Add the user "{{ item }}"
          # See https://docs.ansible.com/ansible/latest/modules/user_module.html?highlight=user
          user:
            name: '{{ item }}'
            append: yes
            groups: docker,sudo
            shell: /bin/bash
            state: present
            generate_ssh_key: yes
            ssh_key_bits: 2048
            # ssh_key_file: .ssh/id_rsa
          # with_items loops over the array: https://docs.ansible.com/ansible/latest/user_guide/playbooks_loops.html?highlight=with_items
          with_items:
            - lcy
            - gavin_kou

        - name: Copy the local public key to the remote user
          # See https://docs.ansible.com/ansible/latest/modules/authorized_key_module.html
          authorized_key:
            user: lcy
            state: present
            # exclusive: True
            key: "{{ lookup('file', lookup('env','HOME') + '/.ssh/id_rsa.pub') }}\n"

        - name: Set authorized key for user gavin_kou copying it from a local file
          # See https://docs.ansible.com/ansible/latest/modules/authorized_key_module.html
          authorized_key:
            user: gavin_kou
            state: present
            key: "{{ lookup('file','~/.ansible/gavin.pub') }}\n"

        # See https://docs.ansible.com/ansible/latest/modules/lineinfile_module.html?highlight=sudoers
        - lineinfile:
            path: /etc/sudoers
            state: present
            regexp: '^%sudo\s'
            line: '%sudo ALL=(ALL) NOPASSWD: ALL'
            validate: '/usr/sbin/visudo -cf %s'

        # Disable root SSH login, so this playbook cannot be run as root a second time.
        - lineinfile:
            path: /etc/ssh/sshd_config
            state: present
            regexp: '^PermitRootLogin\s'
            line: 'PermitRootLogin no'

        - lineinfile:
            path: /etc/ssh/sshd_config
            state: present
            regexp: '^#ClientAliveInterval\s'
            line: 'ClientAliveInterval 30'

        - lineinfile:
            path: /etc/ssh/sshd_config
            state: present
            regexp: '^#ClientAliveCountMax\s'
            line: 'ClientAliveCountMax 3'

        # See https://docs.ansible.com/ansible/latest/modules/systemd_module.html
        - name: Reload sshd
          systemd:
            name: sshd
            state: reloaded

Docker GUIs

  • Docker platform technologies (3)

  • UI For Docker is deprecated; use Portainer instead.

  • Portainer

    ~$  docker run -d -p 9000:9000 --privileged -v /var/run/docker.sock:/var/run/docker.sock --name web-ui uifd/ui-for-docker
  • Portainer is an open-source, lightweight Docker management UI built on the Docker API. It provides a status dashboard, quick deployment from app templates, basic operations on containers, images, networks, and volumes (including pushing/pulling images and creating containers), event logs, a container console, centralized management of Swarm clusters and services, and user login management and access control. It is comprehensive enough to cover most container-management needs of small and mid-sized organizations.

    ~$ docker run -d -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data  -v /etc/hosts:/etc/hosts --name portainer-ui portainer/portainer

Adding other Docker servers to Portainer

  • Protect the Docker daemon socket: for TLS connections, see here.

  • Using the Portainer Agent

  • Portainer supports three ways of adding an external Docker instance. The first is dockerd -H tcp://192.168.xx.xx:2376, listening on a TCP socket, which can be secured with TLS. The second is installing the Portainer agent on the target server. Creating TLS certificates is tedious, and the agent consumes some resources. The third, since Docker 18.09, is client access over SSH: docker -H ssh://me@example.com ps, but Portainer does not support this protocol; it only supports two protocols, tcp:// and unix://.

  • The following shows how to connect to a remote Docker Engine with TLS certificates.

    # Create the CA private key
    ~$ openssl genrsa -aes256 -out ca.key 4096
    # Create the CA certificate from the CA private key.
    ~$ openssl req -new -x509 -days 365 -key ca.key -sha256 -out ca.pem
    # Create the server private key
    ~$ openssl genrsa -out server.key 4096
  • Create the server CSR. Note that if the CN (Common Name) is a domain name such as www.examples.com, clients must reach the server via www.examples.com for verification to succeed. To restrict the IPs or domain names a TLS connection may use, add a key extension file. For example, to allow connections only via 127.0.0.1 and 192.168.1.100: echo subjectAltName = IP:127.0.0.1,IP:192.168.1.100 > allowips.cnf, then append -extfile allowips.cnf to the server-certificate command below.

~$ openssl req -subj "/CN=*" -sha256 -new -key server.key -out server.csr
~$ openssl x509 -req -days 365 -sha256 -in server.csr -CA ca.pem -CAkey ca.key -CAcreateserial -out server.pem
Signature ok
subject=CN = *
Getting CA Private Key
Enter pass phrase for ca.key:
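The subjectAltName flow above can be exercised end to end in a scratch directory. A sketch (unencrypted CA key and short key sizes for brevity; use -aes256 and 4096-bit keys in production):

```shell
# Sketch: sign a server certificate with a subjectAltName extension file,
# restricting the addresses the daemon may be reached at.
dir=$(mktemp -d)
cd "$dir"
openssl genrsa -out ca.key 2048 2>/dev/null
openssl req -new -x509 -days 1 -key ca.key -subj "/CN=test-ca" -sha256 -out ca.pem
openssl genrsa -out server.key 2048 2>/dev/null
openssl req -subj "/CN=*" -sha256 -new -key server.key -out server.csr
echo "subjectAltName = IP:127.0.0.1,IP:192.168.1.100" > allowips.cnf
openssl x509 -req -days 1 -sha256 -in server.csr -CA ca.pem -CAkey ca.key \
    -CAcreateserial -out server.pem -extfile allowips.cnf 2>/dev/null
# The signed certificate now carries both IPs in its SAN field.
openssl x509 -in server.pem -noout -text | grep "IP Address"
```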

- Create the client private key and CSR

~$ openssl genrsa -out client.key 4096
~$ openssl req -subj "/CN=*" -new -key client.key -out client.csr

# For client authentication, add the following extension.
~$ echo extendedKeyUsage = clientAuth > client.cnf
~$ openssl x509 -req -days 365 -sha256 -in client.csr -CA ca.pem -CAkey ca.key -CAcreateserial -out client.pem -extfile client.cnf
  • Run the server and the client for a test. To keep the private keys safe, tighten their permissions: chmod -v 0400 ca.key server.key client.key. To guard the certificates against tampering:
    chmod -v 0444 ca.pem server.pem client.pem.
    ~$ tree
    .
    ├── ca.key
    ├── ca.pem
    ├── ca.srl
    ├── client.key
    ├── client.pem
    ├── server.key
    └── server.pem
    # Run the server
    ~$ sudo dockerd --tlsverify --tlscacert=ca.pem --tlscert=server.pem --tlskey=server.key -H tcp://0.0.0.0:2376
    # Run the client. A hostname must be used here, which is why the portainer-ui container above was started with -v /etc/hosts:/etc/hosts.
    ~$ docker --tlsverify --tlscacert=ca.pem --tlscert=client.pem --tlskey=client.key -H tcp://<hostname>:2376 version
    # Since 18.09 Docker supports access over SSH, which is both safer and more convenient.
    ~$ docker -H ssh://user@domain.com version
    Client:
    Version: 18.09.1
    API version: 1.39
    Go version: go1.10.6
    Git commit: 4c52b90
    Built: Wed Jan 9 19:35:59 2019
    OS/Arch: linux/amd64
    Experimental: false

    Server: Docker Engine - Community
    Engine:
    Version: 18.09.1
    API version: 1.39 (minimum version 1.12)
    Go version: go1.10.6
    Git commit: 4c52b90
    Built: Wed Jan 9 19:02:44 2019
    OS/Arch: linux/amd64
    Experimental: false

Modifying the server-side systemd service

  • The Docker daemon can be started with certificates manually as above, or via the system's systemd unit. Copy the server certificate files to /etc/docker/ and proceed as follows:
    ~$ cat /lib/systemd/system/docker.service
    [...]
    #ExecStart=/usr/bin/dockerd -H fd://  # the original default: local connections only.
    ExecStart=/usr/bin/dockerd --tlsverify --tlscacert=/etc/docker/ca.pem --tlscert=/etc/docker/server.pem --tlskey=/etc/docker/server.key -H tcp://0.0.0.0:2376 -H fd://
    [...]
    ~$ sudo systemctl daemon-reload # reload the unit files.
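Editing the packaged unit under /lib/systemd risks being overwritten on upgrade. An alternative sketch is a systemd drop-in override (paths per standard systemd conventions; the certificate locations are assumed to match the setup above):

```ini
# /etc/systemd/system/docker.service.d/tls.conf
# Created with: sudo mkdir -p /etc/systemd/system/docker.service.d
[Service]
# The empty ExecStart= clears the packaged value before setting a new one.
ExecStart=
ExecStart=/usr/bin/dockerd --tlsverify --tlscacert=/etc/docker/ca.pem --tlscert=/etc/docker/server.pem --tlskey=/etc/docker/server.key -H tcp://0.0.0.0:2376 -H fd://
```

Then apply it with sudo systemctl daemon-reload && sudo systemctl restart docker.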

Enabling certificate mode by default on the client

~$ mkdir -pv ~/.docker
~$ cp ca.pem ~/.docker
~$ cp client.key ~/.docker/key.pem # this exact name is required
~$ cp client.pem ~/.docker/cert.pem # this exact name is required
~$ export DOCKER_HOST=tcp://<HOST>:2376 DOCKER_TLS_VERIFY=1
~$ docker ps   # connect in certificate mode

Docker errors

  • If the following error appears, run systemctl restart docker.service to resolve it.
    docker: Error response from daemon: driver failed programming external connectivity on endpoint condescending_lalande (668389b4f87cc892fc233313eb738d0995c4080d3daabf28bf8c9bbe241a5434):  (iptables failed: iptables --wait -t filter -A DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.2 --dport 9000 -j ACCEPT: iptables: No chain/target/match by that name.
    (exit status 1)).

Using Ansible and Docker together

  • Working with volumes

    ~$ docker volume create redis_vol
    ~$ docker volume inspect redis_vol
    [
    {
    "CreatedAt": "2019-01-09T17:28:44+08:00",
    "Driver": "local",
    "Labels": {},
    "Mountpoint": "/var/lib/docker/volumes/redis_vol/_data",
    "Name": "redis_vol",
    "Options": {},
    "Scope": "local"
    }
    ]
    # Remove all unused volumes
    ~$ docker volume prune
  • docker-py must be installed. A slightly more involved example follows.

    # With Python 3, some modules require installing docker instead: pip3 install docker
    ~$ pip install docker-py
    ~$ tree
    .
    ├── main.yaml
    ├── pgsql
    │ ├── file
    │ │ └── Dockerfile
    │ └── pgsql.yaml
    └── redis
    ├── file
    │ └── Dockerfile
    └── redis.yaml
  • main.yaml

    ---
    - hosts: DB001
      become: yes
      tasks:
        - include: pgsql/pgsql.yaml
        - include: redis/redis.yaml
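On newer Ansible releases the bare include keyword is deprecated; include_tasks is the replacement. A sketch with the same file layout assumed:

```yaml
---
- hosts: DB001
  become: yes
  tasks:
    - include_tasks: pgsql/pgsql.yaml
    - include_tasks: redis/redis.yaml
```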

Redis server

  • https://docs.docker.com/samples/library/redis/

  • https://github.com/docker-library/redis

  • Dockerfile

    # Install the redis service
    FROM debian:stretch
    # CMD echo "hello debian from Dockerfile."
    # https://www.digitalocean.com/community/tutorials/how-to-install-and-secure-redis-on-debian-9
    ENV DEBIAN_FRONTEND noninteractive

    RUN sed -i "s/deb.debian.org/mirrors.cloud.aliyuncs.com/g" /etc/apt/sources.list
    RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install redis-server
    # RUN sed -i 's/^appendonly no$/appendonly yes/g' /etc/redis/redis.conf
    # RUN sed -i 's/^daemonize yes$/daemonize no/g' /etc/redis/redis.conf
    EXPOSE 6379/tcp
    # CMD [ "/etc/init.d/redis-server start" ]
    USER root
    RUN rm -rf /data
    RUN mkdir /data && chown redis:redis -R /data
    VOLUME ["/data" ]
    RUN chown redis:redis -R /data
    CMD ["redis-server","/etc/redis/redis.conf"]
  • redis.yaml


- name: Create the WORKDIR
  file:
    path: workdir
    state: directory
    recurse: yes

- name: Upload the Dockerfile to the target machine
  # https://docs.ansible.com/ansible/latest/modules/synchronize_module.html?highlight=synchronize
  synchronize:
    src: redis/file/Dockerfile
    dest: workdir

- name: Build the Redis image
  # See https://docs.ansible.com/ansible/latest/modules/docker_image_module.html
  docker_image:
    name: redis
    tag: v6
    path: workdir
    state: present

- name: Create the volume directory
  # https://docs.ansible.com/ansible/latest/modules/file_module.html
  become: yes
  file:
    path: ./redis-vol
    state: directory
    # To avoid running docker (and redis-server) as root, make this volume
    # directory owned by the redis uid:gid inside the image, here 101:101.
    owner: 101
    group: 101
    recurse: yes

- name: Create the volume
  # The docker_volume module is problematic here, so create the volume with the
  # command below. A volume cannot be given a path, though options such as
  # --opt type=ext4 --opt device=/dev/sdX are possible; device must be a
  # partition or block device, not a directory.
  command: docker volume create redis_vol3

- name: Redis server
  # See https://docs.ansible.com/ansible/latest/modules/docker_container_module.html?highlight=docker
  docker_container:
    name: redis-server
    # The image built above.
    image: 'redis:v6'
    # command: redis-server --appendonly yes
    state: started
    recreate: yes
    user: redis
    # Require the password 1234 to log in.
    command: redis-server --requirepass 1234 --appendonly yes --dir /data
    published_ports:
      - '6379:6379'
    volumes:
      - ./redis-vol:/data
  • Test the Redis service.
~$ redis-cli
127.0.0.1:6379> auth 1234
OK
127.0.0.1:6379> set dd 100
OK
127.0.0.1:6379> save # persist to disk.
OK
127.0.0.1:6379> exit

~$ redis-cli
127.0.0.1:6379> auth 1234
OK
127.0.0.1:6379> get dd
"100"
127.0.0.1:6379> config get dir # show the working directory.
1) "dir"
2) "/data"
127.0.0.1:6379> quit

PostgreSQL Database

  • For a more elaborate example see docker-postgresql, which supports ENTRYPOINT parameters and primary/replica replication.

    ~$ cat Dockerfile
    # Reference: https://docs.docker.com/engine/examples/postgresql_service/#install-postgresql-on-docker
    FROM debian:stretch
    MAINTAINER lcy

    ENV DEBIAN_FRONTEND noninteractive
    # This line is added to avoid the following error:
    # -----------------------------------------------
    # debconf: unable to initialize frontend: Dialog
    # debconf: (Dialog frontend will not work on a dumb terminal, an emacs shell buffer, or without a controlling terminal.)
    # debconf: falling back to frontend: Readline
    # debconf: unable to initialize frontend: Readline
    # debconf: (Can't locate Term/ReadLine.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.14.2 /usr/local/share/perl/5.14.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.14 /usr/share/perl/5.14 /usr/local/lib/site_perl .) at /usr/share/perl5/Debconf/FrontEnd/Readline.pm line 7, <> line 19.)
    # debconf: falling back to frontend: Teletype
    # dpkg-preconfigure: unable to re-open stdin:
    #------------------------------------------------

    # RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
    # Add the PostgreSQL PGP key to verify their Debian packages.
    # It should be the same key as https://www.postgresql.org/media/keys/ACCC4CF8.asc
    # Deployed on Alibaba Cloud, so switch the apt mirror to Alibaba Cloud's for faster, traffic-free downloads.
    RUN sed -i "s/deb.debian.org/mirrors.cloud.aliyuncs.com/g" /etc/apt/sources.list

    ENV PG_VERSION=10
    ENV PG_USER=postgres
    ENV PG_HOME=/var/lib/postgresql
    ENV PG_RUNDIR=/run/postgresql \
    PG_LOGDIR=/var/log/postgresql \
    PG_DATADIR=${PG_HOME}/${PG_VERSION}/main \
    PG_BINDIR=/usr/lib/postgresql/${PG_VERSION}/bin \
    PG_CONFIG=/etc/postgresql/${PG_VERSION}/main/postgresql.conf

    # These two packages must be installed before apt-key can add the public key.
    RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y apt-utils dirmngr gnupg2

    # CMD ["ping","-c","5","p80.pool.sks-keyservers.net"] may fail here, e.g. nothing is read back or the hostname does not resolve.
    RUN apt-key adv --no-tty --keyserver ipv4.pool.sks-keyservers.net --recv-keys B97B0AFCAA1A47F044F244A07FCC7D46ACCC4CF8

    # Add PostgreSQL's repository. It contains the most recent stable release
    # of PostgreSQL, ``10``.
    RUN echo "deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main" > /etc/apt/sources.list.d/pgdg.list

    # Install ``python-software-properties``, ``software-properties-common`` and PostgreSQL 10.6
    # There are some warnings (in red) that show up during the build. You can hide
    # them by prefixing each apt-get statement with DEBIAN_FRONTEND=noninteractive

    RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y software-properties-common postgresql-10 postgresql-client-10 postgresql-contrib-10
    # Mind the rwx permissions on these file paths.
    USER root
    RUN mkdir -p ${PG_DATADIR} && chown -R postgres:postgres ${PG_DATADIR}
    RUN mkdir -p ${PG_LOGDIR} && chown -R postgres:postgres ${PG_LOGDIR}

    # Adjust PostgreSQL configuration so that remote connections to the
    # database are possible.
    USER postgres
    RUN echo "host all all 0.0.0.0/0 md5" >> /etc/postgresql/10/main/pg_hba.conf

    # And add ``listen_addresses`` to ``/etc/postgresql/10.6/main/postgresql.conf``
    RUN echo "listen_addresses='*'" >> /etc/postgresql/10/main/postgresql.conf

    # Create a PostgreSQL role named ``docker`` with ``docker`` as the password and
    # then create a database `docker` owned by the ``docker`` role.
    # Note: here we use ``&&\`` to run commands one after the other - the ``\``
    # allows the RUN command to span multiple lines.
    RUN /etc/init.d/postgresql start && psql --command "CREATE USER docker WITH SUPERUSER PASSWORD 'docker';" && createdb -O docker docker

    # Expose the PostgreSQL port
    EXPOSE 5432/tcp
    # Add VOLUMEs to allow backup of config, logs and databases
    VOLUME ["/etc/postgresql", "/var/log/postgresql", "/var/lib/postgresql/10/main"]
    # RUN ${PG_BINDIR}/initdb -D ${PG_DATADIR}

    # Set the default command to run when starting the container
    # For version 9.6 this would be /usr/lib/postgresql/9.6/bin/postgres; for 10.x the form below works.
    CMD ["/usr/lib/postgresql/10/bin/postgres", "-D", "/var/lib/postgresql/10/main", "-c", "config_file=/etc/postgresql/10/main/postgresql.conf"]
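The `pg_hba.conf` line appended above ("host all all 0.0.0.0/0 md5") has five whitespace-separated fields: connection type, database, user, client address, and authentication method. A small hypothetical parser (the names are mine, not PostgreSQL's) just to make the layout explicit:

```python
def parse_hba_line(line: str) -> dict:
    """Split one pg_hba.conf host record into its five fields."""
    conn_type, database, user, address, method = line.split()
    return {"type": conn_type, "database": database, "user": user,
            "address": address, "method": method}

rule = parse_hba_line("host all all 0.0.0.0/0 md5")
print(rule["address"], rule["method"])  # any client, md5 password auth
```

`0.0.0.0/0` matches every IPv4 client, which together with `listen_addresses='*'` is what makes the container reachable from outside; tighten the CIDR in production.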
  • pgsql.yaml


- name: Create the WORKDIR
  file:
    path: workdir
    state: directory
    recurse: yes

- name: Upload the Dockerfile to the target host
  # https://docs.ansible.com/ansible/latest/modules/synchronize_module.html?highlight=synchronize
  synchronize:
    src: pgsql/file/Dockerfile
    dest: workdir

- name: Build the PostgreSQL server image
  # https://docs.ansible.com/ansible/latest/modules/docker_image_module.html?highlight=docker_image
  # https://www.postgresql.org/download/linux/debian/
  docker_image:
    name: postgresql-10
    tag: v6
    path: workdir
    # dockerfile: file/Dockerfile
    state: present

- name: Create the VOLUME directory
  # https://docs.ansible.com/ansible/latest/modules/file_module.html
  become: yes
  file:
    path: ./pgsql-vol/10/main
    state: directory
    # To avoid running docker as root, and to avoid running postgresql as root inside
    # the container, set the owner of this volume directory to the uid:gid of the
    # postgres user in the image; here that is uid=111(postgres) gid=121(postgres)
    # groups=121(postgres),120(ssl-cert).
    owner: postgres
    group: postgres
    recurse: yes

# # https://docs.ansible.com/ansible/latest/modules/docker_volume_module.html?highlight=docker_volume
# - name: Create a data volume
#   docker_volume:
#     name: pgdb_001
#     driver_options:
#       device: ./docker/volume/pgsql-vol

- name: Create the postgresql container
  # https://docs.ansible.com/ansible/latest/modules/docker_container_module.html?highlight=docker
  # Equivalent to the command module running:
  # ``docker run -p 5432:5432 -d --name pgsql7 --user postgres -v ~/pgsql-vol:/var/lib/postgresql postgresql-10:v6``
  docker_container:
    name: pgsql
    image: 'postgresql-10:v6'
    # Docker networking: https://docs.ansible.com/ansible/latest/modules/docker_network_module.html
    # network_mode: host
    dns_servers:
      - '8.8.8.8'
      - '100.100.2.136'
    published_ports:
      - '5432:5432'
    state: started
    # restart_policy: always
    # detach: no
    user: postgres
    # https://www.katacoda.com/courses/docker/persisting-data-using-volumes
    volumes:
      - ./pgsql-vol:/var/lib/postgresql

Installing Jenkins


Development Board Overview

  • The STM32 Nucleo-144 F767ZI boards offer combinations of performance and power that provide an affordable and flexible
    way for users to build prototypes and try out new concepts. For compatible boards, the SMPS significantly reduces power
    consumption in Run mode.
  • The Arduino-compatible ST Zio connector expands functionality of the Nucleo open development platform, with a wide choice
    of specialized Arduino* Uno V3 shields.
  • The STM32 Nucleo-144 board does not require any separate probe as it integrates the ST-LINK/V2-1 debugger/programmer.
  • The STM32 Nucleo-144 board comes with the STM32 comprehensive free software libraries and examples available with the
    STM32Cube MCU Package.
  • Key Features
    • STM32 microcontroller in LQFP144 package
    • Ethernet compliant with IEEE-802.3-2002 (depending on STM32 support)
    • USB OTG or full-speed device (depending on STM32 support)
    • 3 user LEDs
    • 2 user and reset push-buttons
    • 32.768 kHz crystal oscillator
    • Board connectors:
      • USB with Micro-AB
      • SWD
      • Ethernet RJ45 (depending on STM32 support)
      • ST Zio connector including Arduino* Uno V3
      • ST morpho
    • Flexible power-supply options: ST-LINK USB VBUS or external sources.
    • On-board ST-LINK/V2-1 debugger/programmer with USB re-enumeration
      capability: mass storage, virtual COM port and debug port.
    • Comprehensive free software libraries and examples available with the STM32Cube MCU package.
  • Per the documentation, "The Cortex®-M7 with FPU core is binary compatible with the Cortex®-M4 core": the
    Cortex-M7 core outperforms the M4 and is binary compatible with it.

Zephyr

System Overview

  • The Zephyr kernel has an extremely small memory footprint and is designed primarily for resource-constrained systems: from simple embedded environmental sensors and LED wearables up to sophisticated smart watches and IoT wireless gateways. Zephyr was designed from the start for multiple architectures, including ARM Cortex-M, Intel x86, ARC, NIOS II and RISC-V. A major advantage of Zephyr is that its operating system (OS) and software development kit (SDK) support hundreds of development boards.

Installing the Zephyr Environment

Initializing the Project

~$ pip3 install west
~$ west init ~/zephyrproject
~$ cd zephyrproject
~$ west update # west syncs a large number of git repositories here.
~$ west zephyr-export
  • After updating, the directory structure looks like this:
~ zephyrproject$ tree -L 2
.
├── bootloader
│ └── mcuboot
├── modules
│ ├── bsim_hw_models
│ ├── crypto
│ ├── debug
│ ├── fs
│ ├── hal
│ ├── lib
│ └── tee
├── tools
│ ├── ci-tools
│ ├── edtt
│ └── net-tools
└── zephyr
├── arch
├── boards
├── cmake
├── CMakeLists.txt
├── CODE_OF_CONDUCT.md
├── CODEOWNERS
├── CONTRIBUTING.rst
├── doc
├── drivers
├── dts
├── include
├── Kconfig
├── Kconfig.zephyr
├── kernel
├── lib
├── LICENSE
├── MAINTAINERS.yml
├── Makefile
├── misc
├── modules
├── README.rst
├── samples
├── scripts
├── share
├── soc
├── subsys
├── tests
├── VERSION
├── version.h.in
├── west.yml
├── zephyr-env.cmd
└── zephyr-env.sh

32 directories, 15 files
  • Install the (many) required Python packages:
    ~ zephyrproject$  pip install -r ./zephyr/scripts/requirements.txt

Installing the Toolchain

~$ wget https://github.com/zephyrproject-rtos/sdk-ng/releases/download/v0.11.4/zephyr-sdk-0.11.4-setup.run
~$ chmod +x zephyr-sdk-0.11.4-setup.run
~$ ./zephyr-sdk-0.11.4-setup.run -- -d ~/zephyr-sdk-0.11.4
  • After the installer finishes, an environment file ~/.zephyrrc is created. Next, install the udev rules:
    ~$ sudo cp ~/zephyr-sdk-0.11.4/sysroots/x86_64-pokysdk-linux/usr/share/openocd/contrib/60-openocd.rules  /etc/udev/rules.d/
    ~$ sudo udevadm control --reload

Board Sample Application

~$  west build -p auto  -b nucleo_f767zi samples/hello_world
[......]
-- west build: building application
[1/131] Preparing syscall dependency handling

[126/131] Linking C executable zephyr/zephyr_prebuilt.elf
Memory region Used Size Region Size %age Used
FLASH: 13448 B 2 MB 0.64%
DTCM: 0 GB 128 KB 0.00%
SRAM: 4432 B 384 KB 1.13%
IDT_LIST: 200 B 2 KB 9.77%
[131/131] Linking C executable zephyr/zephyr.elf
# Inspect the build output directory.
~$ ls build/zephyr/
arch drivers kconfig linker.cmd.dep nucleo_f767zi.dts.pre.d zephyr.bin zephyr.map
boards edt.pickle kernel linker_pass_final.cmd nucleo_f767zi.dts.pre.tmp zephyr.dts zephyr_prebuilt.elf
cmake include lib linker_pass_final.cmd.dep runners.yaml zephyr.elf zephyr_prebuilt.map
CMakeFiles isrList.bin libzephyr.a misc soc zephyr.hex zephyr.stat
cmake_install.cmake isr_tables.c linker.cmd nucleo_f767zi.dts_compiled subsys zephyr.lst
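The `%age Used` column in the linker report above is simply used size over region size; a quick sketch with the values copied from the report:

```python
def pct_used(used: int, region: int) -> float:
    """Percent of a memory region consumed, rounded as in the report."""
    return round(100.0 * used / region, 2)

KIB, MIB = 1024, 1024 * 1024
report = {
    "FLASH":    (13448, 2 * MIB),
    "DTCM":     (0,     128 * KIB),
    "SRAM":     (4432,  384 * KIB),
    "IDT_LIST": (200,   2 * KIB),
}
for region, (used, size) in report.items():
    print(f"{region}: {pct_used(used, size):.2f}%")  # FLASH: 0.64% ... IDT_LIST: 9.77%
```

The hello_world sample uses well under one percent of the F767ZI's 2 MB flash and 384 KB SRAM, which is why it fits comfortably on this part.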
  • Connect the target board over USB and flash the sample with:
~$ west flash
Open On-Chip Debugger 0.10.0+dev-01341-g580d06d9d-dirty (2020-06-25-12:07)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
Info : clock speed 2000 kHz
Info : STLINK V2J23M7 (API v2) VID:PID 0483:374B
Info : Target voltage: 3.245850
Info : stm32f7x.cpu: hardware has 8 breakpoints, 4 watchpoints
Info : Listening on port 3333 for gdb connections
TargetName Type Endian TapName State
-- ------------------ ---------- ------ ------------------ ------------
0* stm32f7x.cpu hla_target little stm32f7x.cpu running

Info : Unable to match requested speed 2000 kHz, using 1800 kHz
Info : Unable to match requested speed 2000 kHz, using 1800 kHz
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08000268 msp: 0x20020e8c
Info : device id = 0x10006451
Info : flash size = 2048 kbytes
Info : Single Bank 2048 kiB STM32F76x/77x found
auto erase enabled
wrote 32768 bytes from file /fullpath/zephyrproject/zephyr/build/zephyr/zephyr.hex in 1.283166s (24.938 KiB/s)
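The KiB/s figure openocd prints is just bytes written divided by elapsed time; for the numbers above:

```python
def kib_per_s(nbytes: int, seconds: float) -> float:
    """Transfer rate in KiB/s, rounded as openocd reports it."""
    return round(nbytes / 1024 / seconds, 3)

print(kib_per_s(32768, 1.283166))  # 24.938, matching openocd's output
```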
  • On the USB serial port /dev/ttyACM0 you should see:
~$ sudo minicom -o -b 115200 -D /dev/ttyACM0
*** Booting Zephyr OS build v2.4.0-rc3-10-g0a0cb52fb229 ***
Hello World! nucleo_f767zi

Board-Level Debugging

~$ west debug
-- west debug: rebuilding
[0/1] cd /fullpath/zephyrproject/zephyr/build/zephyr/cmake/flash && /usr/bin/cmake -E echo

-- west debug: using runner openocd
/fullpath/zephyr-sdk-0.11.4/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb: error while loading shared libraries: libpython3.8.so.1.0: cannot open shared object file: No such file or directory
FATAL ERROR: command exited with status 127: /fullpath/zephyr-sdk-0.11.4/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb -ex 'target remote :3333' /fullpath/zephyrproject/zephyr/build/zephyr/zephyr.elf
  • The error above occurs because Python 3.8.2 was installed with pyenv, so its libraries are not on the system search path, and by default pyenv builds Python statically. Reinstall it with shared libraries enabled and point LD_LIBRARY_PATH at them:
~$ CONFIGURE_OPTS=--enable-shared pyenv install 3.8.2
pyenv: /home/michael/.pyenv/versions/3.8.2 already exists
continue with installation? (y/N) y
Installing Python-3.8.2...
Installed Python-3.8.2 to /home/michael/.pyenv/versions/3.8.2

~$ tree -L 1 /home/michael/.pyenv/versions/3.8.2/lib
/home/michael/.pyenv/versions/3.8.2/lib
├── libpython3.8.a
├── libpython3.8.so -> libpython3.8.so.1.0
├── libpython3.8.so.1.0
├── libpython3.so
├── pkgconfig
└── python3.8

2 directories, 4 files

  • Run the debugger again:
~$ LD_LIBRARY_PATH=/home/michael/.pyenv/versions/3.8.2/lib west debug
-- west debug: rebuilding
[....]
This GDB was configured as "--host=x86_64-build_pc-linux-gnu --target=arm-zephyr-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
[...]
Reading symbols from /fullpath/zephyrproject/zephyr/build/zephyr/zephyr.elf...
Info : stm32f7x.cpu: hardware has 0 breakpoints, 10 watchpoints
Info : Listening on port 3333 for gdb connections
TargetName Type Endian TapName State
-- ------------------ ---------- ------ ------------------ ------------
0* stm32f7x.cpu hla_target little stm32f7x.cpu halted

Info : Listening on port 6333 for tcl connections
Info : Listening on port 4444 for telnet connections
Remote debugging using :3333
Info : accepting 'gdb' connection on tcp/3333
Debugger attaching: halting execution
Info : Unable to match requested speed 2000 kHz, using 1800 kHz
Info : Unable to match requested speed 2000 kHz, using 1800 kHz
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08001004 msp: 0x20020810
force hard breakpoints
Info : device id = 0x10006451
Info : flash size = 2048 kbytes
Info : Single Bank 2048 kiB STM32F76x/77x found
Info : flash size = 1024 bytes
0x08001004 in z_vprintk (out=0x0, ctx=0x0, fmt=0x0, ap=...) at /fullpath/zephyrproject/zephyr/lib/os/printk.c:292
292 while (*s) {
(gdb)

Testing a Non-Standard Board (STM32F030 DEMO BOARD)

  • STM32F030 DEMO BOARD
  • My stm32f030_demo board has no on-board ST-LINK/V2 debugger, but it does expose a GND, VCC, DIO, CLK header. I happen to have a J-Link OB with the same pinout, so it is used below for flashing and debugging.
~$ west build -b stm32f030_demo samples/basic/blinky
  • Configure the openocd connection parameters:

    ~$ cat > jlink-ob-stm32f0.cfg<<EOF
    source [find interface/jlink.cfg]
    transport select swd
    source [find target/stm32f0x.cfg]
    EOF
  • Flash with openocd:

    ~$ cd build/zephyr
    ~$ openocd -f ~/jlink-ob-stm32f0.cfg -c init -c "reset halt" -c "stm32f0x mass_erase 0" -c "flash write_bank 0 zephyr.bin 0" -c "reset run"

NuttX

System Overview

  • NuttX is a real-time operating system (RTOS) focused on standards compliance and a small memory footprint. It can be deployed on microcontrollers from 8-bit to 32-bit. NuttX primarily follows the POSIX and ANSI standards; for features those standards lack, such as fork(), it borrows from VxWorks and other RTOSes. NuttX is written almost entirely in C and uses Kconfig to generate GNU makefiles. The distribution includes the NuttX kernel itself plus a good deal of middleware and board support packages. The kernel and the vast majority of the code were written, and are maintained, by Gregory Nutt, and all community contributions must be approved by him. NuttX was first released by Gregory Nutt under the BSD license in 2007.

Building a Toolchain with Buildroot (Optional)

  • To build a custom toolchain with Buildroot, use the following configuration:

    ~$ git clone https://bitbucket.org/nuttx/buildroot.git buildroot
    ~$ cp configs/cortexm7f-eabi-defconfig-7.4.0 .config
    ~$ make menuconfig
    # final configuration:
    ~$ grep -v '^$\|^#' .config
    BR2_HAVE_DOT_CONFIG=y
    BR2_arm=y
    BR2_cortex_m7f=y
    BR2_GCC_CORTEX=y
    BR2_GCC_CORTEX_M7F=y
    BR2_ARM_EABI=y
    BR2_ARCH="arm"
    BR2_GCC_TARGET_TUNE="cortex-m7"
    BR2_GCC_TARGET_ARCH="armv7-m"
    BR2_GCC_TARGET_ABI="aapcs-linux"
    BR2_WGET="wget --passive-ftp"
    BR2_SVN="svn co"
    BR2_ZCAT="zcat"
    BR2_BZCAT="bzcat"
    BR2_TAR_OPTIONS=""
    BR2_DL_DIR="$(BASE_DIR)/../archives"
    BR2_STAGING_DIR="$(BUILD_DIR)/staging_dir"
    BR2_NUTTX_DIR="$(TOPDIR)/../nuttx"
    BR2_TOPDIR_PREFIX=""
    BR2_TOPDIR_SUFFIX=""
    BR2_GNU_BUILD_SUFFIX="pc-elf"
    BR2_GNU_TARGET_SUFFIX="nuttx-eabi"
    BR2_PACKAGE_BINUTILS=y
    BR2_BINUTILS_VERSION_2_28_1=y
    BR2_BINUTILS_SUPPORTS_NUTTX_OS=y
    BR2_BINUTILS_VERSION="2.28.1"
    BR2_EXTRA_BINUTILS_CONFIG_OPTIONS=""
    BR2_PACKAGE_GCC=y
    BR2_GCC_VERSION_7_4_0=y
    BR2_GCC_SUPPORTS_SYSROOT=y
    BR2_GCC_SUPPORTS_NUTTX_OS=y
    BR2_GCC_SUPPORTS_DOWN_PREREQ=y
    BR2_GCC_DOWNLOAD_PREREQUISITES=y
    BR2_GCC_VERSION="7.4.0"
    BR2_EXTRA_GCC_CONFIG_OPTIONS=""
    BR2_INSTALL_LIBSTDCPP=y
    BR2_PACKAGE_GDB_HOST=y
    BR2_GDB_VERSION_8_0_1=y
    BR2_PACKAGE_GDB_TUI=y
    BR2_GDB_VERSION="8.0.1"
    BR2_PACKAGE_NXFLAT=y
    BR2_PACKAGE_GENROMFS=y
    BR2_PACKAGE_KCONFIG_FRONTENDS=y
    BR2_KCONFIG_VERSION_4_11_0_1=y
    BR2_KCONFIG_FRONTENDS_VERSION="4.11.0.1"
    BR2_LARGEFILE=y
    BR2_SOFT_FLOAT=y
    BR2_TARGET_OPTIMIZATION="-Os -pipe"

  • A successful build creates a directory under buildroot with this structure:

    ~$ tree -L 2 build_arm_hf/
    build_arm_hf/
    ├── root
    └── staging_dir
    ├── arm-elf -> arm-nuttx-eabi
    ├── arm-nuttx-eabi
    ├── bin
    ├── include
    ├── lib
    ├── libexec
    ├── share
    └── usr

    10 directories, 0 files

Compiling

  • nuttx Wiki
  • Add the absolute path of the toolchain above to your shell environment. Third-party toolchains also work, e.g. Zephyr's arm-zephyr-eabi or gcc-arm-none-eabi-6-2017-q2-update; on Debian you can install the distribution's gcc-arm-none-eabi package. In make menuconfig, select CONFIG_ARMV7M_TOOLCHAIN_GNU_EABIL=y.
~$ tools/configure.sh -L | grep "f767"
nucleo-144:f767-netnsh
nucleo-144:f767-nsh
nucleo-144:f767-evalos
  • Since the NUCLEO-F767ZI has an Ethernet port, the nucleo-144:f767-netnsh configuration is used here:

    ~$ tools/configure.sh  nucleo-144:f767-netnsh
    ~$ make oldconfig
    ~$ make menuconfig
    ~$ make # with a third-party toolchain, e.g.: make CROSSDEV=arm-none-eabi-
  • The system configuration finally tested here is covered in the sections below.

ESP8266 Communication

  • USART6 on the board is used for the ESP8266, while USART3 serves as the system console by default:

    CONFIG_STM32F7_USART6=y
    CONFIG_USART6_SERIALDRIVER=y
    CONFIG_DEV_CONSOLE=y
    CONFIG_NSH_CONSOLE=y
    CONFIG_SERIAL_CONSOLE=y
    CONFIG_USART3_SERIAL_CONSOLE=y
    CONFIG_NUCLEO_CONSOLE_VIRTUAL=y
    CONFIG_NETUTILS_ESP8266_DEV_PATH="/dev/ttyS1"

    # Install the CU command, a minimal serial terminal similar to putty or minicom.
    CONFIG_SYSTEM_CUTERM=y
    CONFIG_SYSTEM_CUTERM_DEFAULT_DEVICE="/dev/ttyS0"
    CONFIG_SYSTEM_CUTERM_DEFAULT_BAUD=115200
    CONFIG_SYSTEM_CUTERM_STACKSIZE=2048
    CONFIG_SYSTEM_CUTERM_PRIORITY=100
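All the serial links in this section run at 115200 baud with 8N1 framing (one start bit, eight data bits, one stop bit), so each byte costs ten bit times; a quick sanity check of the resulting throughput ceiling:

```python
baud = 115200
bits_per_frame = 1 + 8 + 1        # start + 8 data + stop (8N1, no parity)
print(baud // bits_per_frame)     # 11520 bytes per second at most
```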

  • Also add the USART6 pin definitions in boards/arm/stm32f7/nucleo-144/include/board.h:

[...]
# define GPIO_USART6_RX GPIO_USART6_RX_2
# define GPIO_USART6_TX GPIO_USART6_TX_2
[...]
  • Wire up the ESP8266:

    STM32F767ZI-NUCLEO          ESP01

    D0 RX ---> TX
    D1 TX ---> RX
    GND ---> GND
    3V3 ---> 3V3
    3V3 ---> CH_PD # pull high for normal (boot) mode
  • Connection test:

    nsh> cu -s 115200 -l /dev/ttyS1
    AT

    OK
    AT+GMR
    AT version:1.7.4.0(May 11 2020 19:13:04)
    SDK version:3.0.4(9532ceb)
    compile time:May 27 2020 10:12:17
    Bin version(Wroom 02):1.7.4
    OK

    AT+SYSRAM?
    +SYSRAM:51952

    OK
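The AT exchange above is plain line-oriented text terminated by `OK` or `ERROR`. A small hypothetical helper (not part of NuttX; the names are mine) that splits a captured response into payload lines and a final status, applied to the `AT+SYSRAM?` reply above:

```python
def parse_at_response(raw: str):
    """Split an AT response into payload lines and a final status line."""
    lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
    status = lines[-1] if lines and lines[-1] in ("OK", "ERROR") else None
    payload = lines[:-1] if status else lines
    return payload, status

# The modem echoes the command, then sends the data line, a blank line, and "OK".
payload, status = parse_at_response("AT+SYSRAM?\r\n+SYSRAM:51952\r\n\r\nOK\r\n")
free_ram = int(payload[-1].split(":")[1])
print(status, free_ram)  # OK 51952
```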

External SPI SD Card Reader

  • SPI3 ((PB3:CLK), (PB4:MISO), (PB5:MOSI)) is used to attach the external SD card reader. According to the
    stm32f767zi.pdf datasheet, NSS (CS) can be driven either in software or in hardware (PA4:GPIO_SPI3_NSS_2);
    since this use case is very simple, PA4 is used here as the slave's CS.
  • NuttX has no mmcsd-related file under boards/arm/stm32f7/nucleo-144/src; copy a
    stm32_mmcsd.c from another board and adapt it.
  • Define CS in boards/arm/stm32f7/nucleo-144/src/nucleo-144.h:
[...]
#define GPIO_SPI_CS (GPIO_OUTPUT | GPIO_PUSHPULL | GPIO_SPEED_50MHz | \
GPIO_OUTPUT_SET)
#if defined(CONFIG_MMCSD_SPI)
#define GPIO_SPI3_CARD_CS (GPIO_SPI_CS | GPIO_PORTA | GPIO_PIN4)
#endif
[...]
  • Implement the functions in boards/arm/stm32f7/nucleo-144/src/stm32_mmcsd.c:
[....]
#ifdef CONFIG_STM32F7_SPI3
int stm32_spi3register(struct spi_dev_s *dev, spi_mediachange_t callback,
void *arg)
{
/* TODO: media change callback */
return OK;
}
#endif

/*****************************************************************************
* Name: stm32_mmcsd_initialize
*
* Description:
* Initialize SPI-based SD card and card detect thread.
****************************************************************************/

int stm32_mmcsd_initialize(int port, int minor)
{
struct spi_dev_s *spi;
int rv;

stm32_configgpio(GPIO_SPI3_CARD_CS); /* Assign CS */
stm32_gpiowrite(GPIO_SPI3_CARD_CS, 1); /* Ensure the CS is inactive */

mcinfo("INFO: Initializing mmcsd port %d minor %d \n",
port, minor);

spi = stm32_spibus_initialize(port);
if (spi == NULL)
{
mcerr("ERROR: Failed to initialize SPI port %d\n", port);
return -ENODEV;
}

rv = mmcsd_spislotinitialize(minor, minor, spi);
if (rv < 0)
{
mcerr("ERROR: Failed to bind SPI port %d to SD slot %d\n",
port, minor);
return rv;
}

spiinfo("INFO: mmcsd card has been initialized successfully\n");
return OK;
}
  • Modify the related functions in boards/arm/stm32f7/nucleo-144/src/stm32_spi.c:

    [...]
    #ifdef CONFIG_STM32F7_SPI3
    void stm32_spi3select(FAR struct spi_dev_s *dev, uint32_t devid, bool selected)
    {
    spiinfo("devid: %d CS: %s\n", (int)devid, selected ? "assert" : "de-assert");
    #if defined(CONFIG_MMCSD_SPI)
    stm32_gpiowrite(GPIO_SPI3_CARD_CS, !selected);
    #endif
    }

    uint8_t stm32_spi3status(FAR struct spi_dev_s *dev, uint32_t devid)
    {
    uint8_t ret = 0;
    #if defined(CONFIG_MMCSD_SPI)
    if (devid == SPIDEV_MMCSD(0))
    {
    /* Note: SD_DET is pulled high when there's no SD card present. */
    /* The reader has no CD (card-detect) pin, or the feature is unused, so we
     * must report the card as present; the subsequent read/write path depends
     * on this condition, which is important. */

    ret |= SPI_STATUS_PRESENT;
    }
    #endif
    return ret;
    }
  • Hook MMCSD_SPI initialization into the board bring-up by modifying boards/arm/stm32f7/nucleo-144/src/stm32_appinitialize.c:

    [...]
    #ifdef CONFIG_MMCSD_SPI
    /* Initialize the MMC/SD SPI driver (SPI3 is used) */

    ret = stm32_mmcsd_initialize(3, CONFIG_NSH_MMCSDMINOR);
    if (ret < 0)
    {
    syslog(LOG_ERR, "Failed to initialize SD slot %d: %d\n",
    CONFIG_NSH_MMCSDMINOR, ret);
    }
    #endif
    [....]
  • Finally, add the following to boards/arm/stm32f7/nucleo-144/src/Makefile:

    [...]
    ifeq ($(CONFIG_MMCSD_SPI),y)
    CSRCS += stm32_mmcsd.c
    endif
    [...]

QSPI Driver (Not Yet Working)

  • For some reason the board files do not define the QSPI pins. The definitions below only compile; reads and writes have not succeeded yet.
    ~$ grep "/* QSPI"  boards/arm/stm32f7/nucleo-144/include/board.h -A 20
    /* QSPI
    *
    * reference from UM1974 chapter 6.14
    * stm32f7/hardware/stm32f76xx77xx_pinmap.h
    *
    * PB6 GPIO_QSPI_CS CN10-13
    * PB2 GPIO_QSPI_SCK CN10-15
    * PD11 GPIO_QSPI_IO0 CN10-23
    * PD12 GPIO_QSPI_IO1 CN10-21
    * PE2 GPIO_QSPI_IO2 CN10-25
    * PD13 GPIO_QSPI_IO3 CN10-19
    *
    * */
    #define GPIO_QSPI_CS GPIO_QUADSPI_BK1_NCS_1
    #define GPIO_QSPI_SCK GPIO_QUADSPI_CLK_1
    #define GPIO_QSPI_IO0 GPIO_QUADSPI_BK1_IO0_3
    #define GPIO_QSPI_IO1 GPIO_QUADSPI_BK1_IO1_3
    #define GPIO_QSPI_IO2 GPIO_QUADSPI_BK1_IO2_1
    #define GPIO_QSPI_IO3 GPIO_QUADSPI_BK1_IO3_2


Flashing and Testing

  • A successful build leaves three files in nuttx: nuttx, nuttx.bin and nuttx.hex. Flash and debug the target board with openocd:

    ~$ sudo openocd -f board/stm32f7discovery.cfg -c init -c "reset halt" -c "flash write_image erase nuttx.bin 0x08000000"
  • Connect Ethernet and USB, open the serial port with minicom, and enter the system. Sometimes boot stalls for a long time; try pressing the B1 USER button on the board.

    ~$ minicom -o -b 115200 -D /dev/ttyACM0

    nsh> help
    help usage: help [-v] [<cmd>]

    . cat df hexdump mkdir mw set umount
    [ cd echo ifconfig mkfatfs nslookup sleep unset
    ? cp exec ifdown mkfifo ps source usleep
addroute cmp exit ifup mkrd pwd test wget
    arp dirname false kill mh rm time xd
    basename dd free ls mount rmdir true
    break delroute help mb mv route uname

    Builtin Apps:
    ping6 renew ntpcstop sh
    ntpcstart ping mm nsh

IoT.js

~$ git clone https://github.com/Samsung/iotjs.git
~$ ls
apps buildroot iotjs nuttx
# Build it first.
~$ cd iotjs && tools/build.py --target-arch=arm --target-os=nuttx --nuttx-home=/fullpath/nuttx --target-board=stm32f7nucleo --jerry-heaplimit=78
==> Initialize submodule

git submodule init

Submodule 'deps/http-parser' (https://github.com/Samsung/http-parser.git) registered for path 'deps/http-parser'
Submodule 'deps/jerry' (https://github.com/jerryscript-project/jerryscript.git) registered for path 'deps/jerry'
Submodule 'deps/libtuv' (https://github.com/Samsung/libtuv.git) registered for path 'deps/libtuv'
Submodule 'deps/mbedtls' (https://github.com/ARMmbed/mbedtls.git) registered for path 'deps/mbedtls'
git submodule update
  • The build fails with the error below because of conditional macros in the headers: CONFIG_SERIAL_TERMIOS=y was not selected in the NuttX configuration, which leads to error: field 'orig_termios' has incomplete type.

    In file included from /fullpath/iotjs/deps/libtuv/include/uv.h:77:0,
    from /fullpath/iotjs/deps/libtuv/src/fs-poll.c:22:
    /fullpath/iotjs/deps/libtuv/include/uv-unix.h:428:18: error: field 'orig_termios' has incomplete type
    struct termios orig_termios; \
    ^
    /fullpath/iotjs/deps/libtuv/include/uv.h:680:3: note: in expansion of macro 'UV_TTY_PRIVATE_FIELDS'
    UV_TTY_PRIVATE_FIELDS
    ^
    make[5]: *** [CMakeFiles/tuv.dir/build.make:63: CMakeFiles/tuv.dir/src/fs-poll.c.obj] Error 1
    make[4]: *** [CMakeFiles/Makefile2:73: CMakeFiles/tuv.dir/all] Error 2
    make[3]: *** [Makefile:130: all] Error 2
    make[2]: *** [CMakeFiles/libtuv.dir/build.make:111: deps/libtuv/src/libtuv-stamp/libtuv-build] Error 2
    make[1]: *** [CMakeFiles/Makefile2:184: CMakeFiles/libtuv.dir/all] Error 2
    make: *** [Makefile:130: all] Error 2
  • Create an iotjs directory under apps/system; the app content comes from the STM32F4 tree:

    ~$ mkdir apps/system/iotjs
    ~$ cp iotjs/config/nuttx/stm32f4dis/app/* apps/system/iotjs
  • A line source "/<fullpath>/apps/system/iotjs/Kconfig" must be added to apps/system/Kconfig.

Talking to the ESP8266 over a Serial Port

  • A few configuration options are mandatory for the ESP8266; UART4 on the board is used here:
    -> System Type -> STM32 Peripheral Support -> [*] UART4
    -> Application Configuration -> Network Utilities -> [*] ESP8266
    -> Application Configuration -> System Libraries and NSH Add-Ons -> [*] CU minimal serial terminal
  • Error from missing UART4 pin definitions:
    CC:  chip/stm32_serial.c
    chip/stm32_serial.c:1037:20: error: 'GPIO_UART4_TX' undeclared here (not in a function)
    .tx_gpio = GPIO_UART4_TX,
    ^
    chip/stm32_serial.c:1038:20: error: 'GPIO_UART4_RX' undeclared here (not in a function)
    .rx_gpio = GPIO_UART4_RX,
    ^
    make[1]: *** [Makefile:154: stm32_serial.o] Error 1

STM32F103 Minimal System

  • The minimal board used here brings out standard JTAG and most of the pins, which keeps things clear; the main reference is the board's README.md inside NuttX. Enabling the CDC/ACM serial port over USB did not succeed here, though.
~$ tools/configure.sh  stm32f103-minimum:usbnsh
  • According to the in-tree documentation, the STM32F103C8T6 actually has 128KB of internal flash rather than the 64KB its datasheet claims, so modify the ld script as follows:

    ~$ cat boards/arm/stm32/stm32f103-minimum/scripts/ld.script
    [....]
    /* The STM32F103C8T6 has 64Kb of FLASH beginning at address 0x0800:0000 and
    * 20Kb of SRAM beginning at address 0x2000:0000. When booting from FLASH,
    * FLASH memory is aliased to address 0x0000:0000 where the code expects to
    * begin execution by jumping to the entry point in the 0x0800:0000 address
    * range.
    *
    * NOTE: While the STM32F103C8T6 states that the part has 64Kb of FLASH,
    * all parts that I have seen do, in fact, have 128Kb of FLASH. That
    * additional 64Kb of FLASH can be utilized by simply change the LENGTH
    * of the flash region from 64K to 128K.
    */

    MEMORY
    {
    flash (rx) : ORIGIN = 0x08000000, LENGTH = 128K
    sram (rwx) : ORIGIN = 0x20000000, LENGTH = 20K
    }
    [...]
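With LENGTH raised to 128K the flash region spans 0x08000000 up to 0x08020000; a quick check of the map, which also shows why the change matters for the images flashed later in this section (e.g. a 78848-byte nuttx.bin would not fit in 64 KiB):

```python
FLASH_ORIGIN, FLASH_LEN = 0x08000000, 128 * 1024   # LENGTH changed from 64K
SRAM_ORIGIN, SRAM_LEN   = 0x20000000,  20 * 1024

print(hex(FLASH_ORIGIN + FLASH_LEN))  # end of flash
print(hex(SRAM_ORIGIN + SRAM_LEN))    # end of SRAM
print(78848 > 64 * 1024)              # a 78848-byte image overflows 64 KiB
```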
  • Configure openocd to flash and debug through the J-Link OB:

    ~$ cat ~/jlink-ob-stm32f1-swd.cfg
    source [find interface/jlink.cfg]
    transport select swd
    source [find target/stm32f1x.cfg]

    # Flash
    ~$ openocd -f ~/jlink-ob-stm32f1-swd.cfg -c init -c "reset halt" -c "flash write_image erase nuttx.bin 0x08000000" -c "reset run"
    Open On-Chip Debugger 0.10.0+dev-01423-g3ffa14b04-dirty (2020-10-14-08:59)
    Licensed under GNU GPL v2
    For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
    Info : J-Link ARM-OB STM32 compiled Aug 22 2012 19:52:04
    Info : Hardware version: 7.00
    Info : VTarget = 3.300 V
    Info : clock speed 1000 kHz
    Info : SWD DPIDR 0x1ba01477
    Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints
    Info : starting gdb server for stm32f1x.cpu on 3333
    Info : Listening on port 3333 for gdb connections
    target halted due to debug-request, current mode: Thread
    xPSR: 0x01000000 pc: 0x08000130 msp: 0x20000f54
    Info : device id = 0x20036410
    Info : flash size = 128kbytes
    auto erase enabled
    wrote 78848 bytes from file nuttx.bin in 7.988087s (9.639 KiB/s)

  • openocd can also flash and debug through the ST-LINK of another board; official evaluation boards such as the NUCLEO series usually carry one. Here the ST-LINK of a NUCLEO-L152RE is used: first remove the CN2 jumpers, then, per the official document, the (SWD) CN4 pinout is 1:VDD, 2:SWCLK, 3:GND, 4:SWDIO, 5:NRST, 6:SWO, and only three wires are needed:

      SWD         STM32F103C8T6
PIN2 SWCLK ----> P14 SWCLK
PIN3 GND ----> GND
PIN4 SWDIO ----> P13 SWDIO

~$ openocd -f interface/stlink.cfg -f target/stm32f1x.cfg -c init -c "reset halt" -c "flash write_image erase nuttx.bin 0x08000000" -c "reset run"
Open On-Chip Debugger 0.10.0+dev-01423-g3ffa14b04-dirty (2020-10-14-08:59)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "hla_swd". To override use 'transport select <transport>'.
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
Info : clock speed 1000 kHz
Info : STLINK V2J22M5 (API v2) VID:PID 0483:374B
Info : Target voltage: 3.262028
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : starting gdb server for stm32f1x.cpu on 3333
Info : Listening on port 3333 for gdb connections
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08000130 msp: 0x20000d40
Info : device id = 0x20036410
Info : flash size = 128kbytes
auto erase enabled
wrote 109568 bytes from file nuttx.bin in 5.901268s (18.132 KiB/s)
  • Since the STM32F103C8T6 has only 20K of RAM, it is easy to run out of memory. The classic symptom is that a built-in app fails to run, as shown below:
NuttShell (NSH) NuttX-9.1.0
nsh> ?
help usage: help [-v] [<cmd>]

. cmp exit kill mount rmdir true
[ dirname export ls mv rmmod uname
? date false lsmod mw set umount
basename dd free mb printf sleep unset
cat df help mkdir ps source usleep
cd echo hexdump mksmartfs pwd test xd
cp exec insmod mh rm time

Builtin Apps:
chat sh hello spi nsh cu
nsh> free
total used free largest
Umem: 17072 15272 1800 1736

nsh> spi
nsh: spi: command not found
  • The cause: there is not enough RAM to copy the program from flash into memory for execution, which produces the error above. If the debug error options are enabled at build time, the nsh shell reports more detailed error information; see the reference.
  • My fix was to shrink the various thread stack sizes (STACKSIZE) to at most 1024. As shown above, only 1800 bytes are free, while the spi tool's STACKSIZE was set to 2048 at build time.
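The stack trimming described above is done in the board defconfig (or via menuconfig). A sketch of the kind of options involved, assuming NuttX 9.x option names; the exact names and values depend on the NuttX version and on which apps are enabled (each app has its own *_STACKSIZE option):

```
# Illustrative values only -- tune per board and per app.
CONFIG_IDLETHREAD_STACKSIZE=1024
CONFIG_USERMAIN_STACKSIZE=1024
CONFIG_DEFAULT_TASK_STACKSIZE=1024
CONFIG_PTHREAD_STACK_DEFAULT=1024
```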

Connecting the ESP8266

  • USART3 is used here, with the following option set:
    -> System Type -> U[S]ART Configuration -> Serial Driver Configuration -> [*] Disable reordering of ttySx devices.
    The ESP8266's serial device path is /dev/ttyS1. The wiring is shown below.
  STM32            ESP8266
PB10 TX3 ------> RX
PB11 RX3 ------> TX
3V3 ------> CH_PD # This should really be driven by a GPIO.
3V3 ------> 3V3
GND ------> GND

nsh> ?
help usage: help [-v] [<cmd>]

. cmp exit kill mount rmdir true
[ dirname export ls mv rmmod uname
? date false lsmod mw set umount
basename dd free mb printf sleep unset
cat df help mkdir ps source usleep
cd echo hexdump mksmartfs pwd test xd
cp exec insmod mh rm time

Builtin Apps:
chat sh spi nsh cu
nsh> cu -s 115200 -l /dev/ttyS1
AT

OK
AT+GMR
AT version:1.7.4.0(May 11 2020 19:13:04)
SDK version:3.0.4(9532ceb)
compile time:May 27 2020 10:12:17
Bin version(Wroom 02):1.7.4
OK

Reading and writing an SD card (over SPI)

  • Only SPI1 can be used as the SD interface, because it is hard-coded in boards/arm/stm32/stm32f103-minimum/src/stm32_mmcsd.c as follows:

    /*****************************************************************************
    * Pre-processor Definitions
    ****************************************************************************/

    #ifndef CONFIG_STM32_SPI1
    # error "SD driver requires CONFIG_STM32_SPI1 to be enabled"
    #endif

    #ifdef CONFIG_DISABLE_MOUNTPOINT
    # error "SD driver requires CONFIG_DISABLE_MOUNTPOINT to be disabled"
    #endif

    /*****************************************************************************
    * Private Definitions
    ****************************************************************************/

    static const int SD_SPI_PORT = 1; /* SD is connected to SPI1 port */
    static const int SD_SLOT_NO = 0; /* There is only one SD slot */

  • The basic required configuration options are:

    CONFIG_STM32_SPI=y
    CONFIG_STM32_SPI1=y
    CONFIG_SPI=y
    CONFIG_SPI_EXCHANGE=y
    CONFIG_SPI_DRIVER=y
    CONFIG_MMCSD_SPI=y
    CONFIG_MMCSD_SPICLOCK=20000000
    CONFIG_MMCSD_SPIMODE=0

    CONFIG_MMCSD=y
    CONFIG_MMCSD_NSLOTS=1
    CONFIG_MMCSD_SPI=y
    CONFIG_MMCSD_SPICLOCK=20000000
    CONFIG_MMCSD_SPIMODE=0
    CONFIG_MMCSD_IDMODE_CLOCK=400000
    CONFIG_NSH_MMCSDMINOR=0
    CONFIG_NSH_MMCSDSLOTNO=0
    CONFIG_NSH_MMCSDSPIPORTNO=1

    CONFIG_FS_FAT=y
    CONFIG_FAT_LCNAMES=y
    CONFIG_FAT_LFN=y
    CONFIG_FAT_MAXFNAME=32
    CONFIG_FAT_LFN_ALIAS_TRAILCHARS=0
    CONFIG_FSUTILS_MKFATFS=y

Reading and writing a W25Q32FV

  • Inspection of the code shows that the W25Q32FV driver is likewise hard-coded to SPI1 by default. These are the usual required options when using it on SPI1:

    CONFIG_STM32_SPI1=y
    CONFIG_STM32_SPI=y

    CONFIG_MTD_W25=y
    CONFIG_W25_SPIMODE=0
    CONFIG_W25_SPIFREQUENCY=20000000

    CONFIG_MTD=y
    CONFIG_MTD_PARTITION=y
    CONFIG_MTD_BYTE_WRITE=y
    CONFIG_MTD_SMART=y
    CONFIG_MTD_SMART_SECTOR_SIZE=1024
    CONFIG_MTD_SMART_WEAR_LEVEL=y
    CONFIG_MTD_W25=y
    CONFIG_FSUTILS_MKSMARTFS=y
  • With the configuration above, wire the W25Q32FV as follows:

    W25Q32FV       STM32F103

    CS ----> PA4/NSS
    DO ----> PA6/MISO
    DI ----> PA7/MOSI
    CLK ----> PA5/SCK
    GND ----> GND
    VCC ----> 3V3
  • Format, mount, and run a read/write test:

    nsh> mkdir /tmp
    nsh> ls /
    /:
    dev/
    mnt/
    proc/
    nsh> mksmartfs /dev/smart0p1
    nsh> mount -t smartfs /dev/smart0p1 /tmp
    nsh> echo "11223456" > /tmp/file1.txt
    nsh> cat /tmp/file1.txt
    11223456
  • Most pins on the STM32F103C8T6 board are labeled with their functions; a few, such as PB12, PB13, ..., require consulting Chapter 3, Table 5 of stm32f103c8.pdf.

Extending to SPI2

  • The STM32F103C8T6 minimal board exposes two SPI ports:

    • SPI1: PA4(NSS), PA5(SCK), PA6(MISO), PA7(MOSI)
    • SPI2: PB12(NSS), PB13(SCK), PB14(MISO), PB15(MOSI)
  • But NuttX only enables SPI1 for this board, and it can only drive a single peripheral. So the files below are modified so that SPI1 serves the SD card (over SPI) and SPI2 serves the W25Q32FV.

  • Following the FLASH_SPI1_CS example, add a FLASH_SPI2_CS macro on PB12; the resulting file:

    ~$ cat boards/arm/stm32/stm32f103-minimum/src/stm32f103_minimum.h
    [...]
    /* SPI chip selects */

    #define FLASH_SPI2_CS (GPIO_OUTPUT|GPIO_CNF_OUTPP|GPIO_MODE_50MHz|\
    GPIO_OUTPUT_SET|GPIO_PORTB|GPIO_PIN12)
    [...]
  • Modify stm32_spi.c. This file needs quite a few changes, so they are presented as a patch:

    ~$ git diff  boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c  > spi2.patch
    diff --git a/boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c b/boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c
    index 6f3a585902..01b5b861e8 100644
    --- a/boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c
    +++ b/boards/arm/stm32/stm32f103-minimum/src/stm32_spi.c
    @@ -75,7 +75,7 @@ void stm32_spidev_initialize(void)
    */

    #ifdef CONFIG_MTD_W25
    - stm32_configgpio(FLASH_SPI1_CS); /* FLASH chip select */
    + stm32_configgpio(FLASH_SPI2_CS); /* FLASH chip select */
    #endif

    #ifdef CONFIG_CAN_MCP2515
    @@ -197,7 +197,7 @@ void stm32_spi1select(FAR struct spi_dev_s *dev, uint32_t devid,
    #endif

    #ifdef CONFIG_MTD_W25
    - stm32_gpiowrite(FLASH_SPI1_CS, !selected);
    + //stm32_gpiowrite(FLASH_SPI1_CS, !selected);
    #endif
    }

    @@ -227,6 +227,9 @@ uint8_t stm32_spi1status(FAR struct spi_dev_s *dev, uint32_t devid)
    void stm32_spi2select(FAR struct spi_dev_s *dev, uint32_t devid,
    bool selected)
    {
    +#ifdef CONFIG_MTD_W25
    + stm32_gpiowrite(FLASH_SPI2_CS, !selected);
    +#endif
    }

    uint8_t stm32_spi2status(FAR struct spi_dev_s *dev, uint32_t devid)
    @@ -294,6 +297,16 @@ int stm32_spi1cmddata(FAR struct spi_dev_s *dev, uint32_t devid,
    return -ENODEV;
    }
    #endif
    +
    +#ifdef CONFIG_STM32_SPI2
    +int stm32_spi2cmddata(FAR struct spi_dev_s *dev, uint32_t devid,
    + bool cmd)
    +{
    + return -ENODEV;
    +}
    +#endif
    +
    +
    #endif

    #endif /* CONFIG_STM32_SPI1 || CONFIG_STM32_SPI2 */

  • Modify stm32_w25.c as follows:

    git diff  boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c
    diff --git a/boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c b/boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c
    index 6e9d12718d..63ba5153ce 100644
    --- a/boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c
    +++ b/boards/arm/stm32/stm32f103-minimum/src/stm32_w25.c
    @@ -47,7 +47,7 @@
    #include <errno.h>
    #include <debug.h>

    -#ifdef CONFIG_STM32_SPI1
    +#ifdef CONFIG_STM32_SPI2
    # include <nuttx/spi/spi.h>
    # include <nuttx/mtd/mtd.h>
    # include <nuttx/fs/smart.h>
    @@ -67,13 +67,13 @@
    * timer
    */

    -#define W25_SPI_PORT 1
    +#define W25_SPI_PORT 2

    /* Configuration ************************************************************/
    /* Can't support the W25 device if it SPI1 or W25 support is not enabled */

    #define HAVE_W25 1
    -#if !defined(CONFIG_STM32_SPI1) || !defined(CONFIG_MTD_W25)
    +#if !defined(CONFIG_STM32_SPI2) || !defined(CONFIG_MTD_W25)
    # undef HAVE_W25
    #endif

  • Modify stm32_bringup.c as follows:

    ~$ git diff boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c
    diff --git a/boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c b/boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c
    index efa651034e..b4b0379bde 100644
    --- a/boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c
    +++ b/boards/arm/stm32/stm32f103-minimum/src/stm32_bringup.c
    @@ -143,7 +143,7 @@

    /* Can't support the W25 device if it SPI1 or W25 support is not enabled */

    -#if !defined(CONFIG_STM32_SPI1) || !defined(CONFIG_MTD_W25)
    +#if !defined(CONFIG_STM32_SPI2) || !defined(CONFIG_MTD_W25)
    # undef HAVE_W25
    #endif

Enabling serial debug output

  • If the corresponding DEBUG options are not enabled in the configuration system, the system only reports a bare error number on failure. The options below enable error reporting for SPI, FS, and MMC/SD:
    CONFIG_DEBUG_ALERT=y
    CONFIG_DEBUG_FEATURES=y
    CONFIG_DEBUG_ERROR=y
    CONFIG_DEBUG_FS=y
    CONFIG_DEBUG_FS_ERROR=y
    CONFIG_DEBUG_IRQ=y
    CONFIG_DEBUG_IRQ_ERROR=y
    CONFIG_DEBUG_MEMCARD=y
    CONFIG_DEBUG_MEMCARD_ERROR=y
    CONFIG_DEBUG_SPI=y
    CONFIG_DEBUG_SPI_ERROR=y
    CONFIG_DEBUG_FULLOPT=y
    CONFIG_ARCH_HAVE_HARDFAULT_DEBUG=y
    CONFIG_ARCH_HAVE_MEMFAULT_DEBUG=y
    CONFIG_STM32_DISABLE_IDLE_SLEEP_DURING_DEBUG=y
nsh> mount -t vfat /dev/mmcsd1 /mnt/sd1
nsh: mount: mount failed: 19
  • When such an error number appears, open nuttx/include/errno.h and look it up:

    #define ENODEV              19
    #define ENODEV_STR "No such device"
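On a Linux host the same lookup can be done without opening the header, since NuttX reuses the standard POSIX numbering for common codes like 19/ENODEV (this assumes python3 is installed on the host):

```shell
# Map a bare errno number to its symbolic name and message on the host.
python3 -c "import errno, os; print(errno.errorcode[19], '-', os.strerror(19))"
# ENODEV - No such device
```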
  • The final configuration enabling both the SPI1 and SPI2 interfaces, including the required SPI, FS, MMC/SD and MTD options:


    CONFIG_STM32_SPI1=y
    CONFIG_STM32_SPI2=y
    CONFIG_STM32_SPI=y

    CONFIG_ARCH_HAVE_SPI_BITORDER=y
    CONFIG_SPI=y
    CONFIG_SPI_EXCHANGE=y
    CONFIG_SPI_CMDDATA=y
    CONFIG_SPI_DRIVER=y
    CONFIG_MMCSD=y
    CONFIG_MMCSD_NSLOTS=1
    CONFIG_MMCSD_SPI=y
    CONFIG_MMCSD_SPICLOCK=20000000
    CONFIG_MMCSD_SPIMODE=0
    CONFIG_MMCSD_IDMODE_CLOCK=400000

    CONFIG_MTD=y
    CONFIG_MTD_PARTITION=y
    CONFIG_MTD_BYTE_WRITE=y
    CONFIG_MTD_SMART=y
    CONFIG_MTD_SMART_SECTOR_SIZE=1024
    CONFIG_MTD_SMART_WEAR_LEVEL=y
    CONFIG_MTD_W25=y
    CONFIG_W25_SPIMODE=0
    CONFIG_W25_SPIFREQUENCY=20000000

    CONFIG_FS_NEPOLL_DESCRIPTORS=8
    CONFIG_FS_MQUEUE_MPATH="/var/mqueue"
    CONFIG_FS_FAT=y
    CONFIG_FS_SMARTFS=y
    CONFIG_SMARTFS_ERASEDSTATE=0xff
    CONFIG_SMARTFS_MAXNAMLEN=16
    CONFIG_FS_PROCFS=y
    CONFIG_FS_PROCFS_REGISTER=y
    CONFIG_FS_PROCFS_EXCLUDE_ENVIRON=y

    CONFIG_FSUTILS_MKFATFS=y
    CONFIG_FSUTILS_MKSMARTFS=y
    CONFIG_NSH_MMCSDMINOR=0
    CONFIG_NSH_MMCSDSLOTNO=0
    CONFIG_NSH_MMCSDSPIPORTNO=1
    CONFIG_NSH_CODECS_BUFSIZE=128

  • Testing the system:

    NuttShell (NSH) NuttX-9.1.0
    nsh> ls /dev
    /dev:
    console
    mmcsd0
    null
    smart0p0
    smart0p1
    smart0p2
    smart0p3
    ttyS0
    nsh> free
    total used free largest
    Umem: 17536 12872 4664 4664
    nsh> mkdir /mnt /p0
    nsh: mkdir: too many arguments
    nsh> mkdir /mnt
    nsh> mkdir /p0
    nsh> mount -t vfat /dev/mmcsd0 /mnt
    nsh> mount -t smartfs /dev/smart0p0 /p0
    nsh> df
    Block Number
    Size Blocks Used Available Mounted on
    16384 15611 3 15608 /mnt
    1024 64 11 53 /p0
    0 0 0 0 /proc
    nsh> free
    total used free largest
    Umem: 17536 14888 2648 2584
    nsh> ls /p0
    /p0:
    file.txt
    nsh> cat /p0/file.txt
    test
    nsh> free
    total used free largest
    Umem: 17536 14888 2648 2584
    nsh> ls /mnt
    /mnt:
    file1.txt
    nsh> cat /mnt/file1.txt
    1112222ssss

NUCLEO-L152RE (MB1136 c-03)

Submitting code to NuttX

  • GitHub documentation (Chinese)
  • Making Changes Using Git
  • The NuttX code lives on GitHub, so submitting code to it requires a GitHub account first.
  • Fork NuttX under your own GitHub account, then clone the fork locally.
  • Set the original https://github.com/apache/incubator-nuttx.git as the upstream remote:
~$ git clone <your forked incubator-nuttx project clone url>
~$ git remote add upstream https://github.com/apache/incubator-nuttx.git
  • Create a local development branch and push it to your own fork:
    ~$ git checkout -b dev/stm32l152re-ili93418b-driver
    Switched to a new branch 'dev/stm32l152re-ili93418b-driver'

    ~$ git push
    fatal: The current branch dev/stm32l152re-ili93418b-driver has no upstream branch.
    To push the current branch and set the remote as upstream, use

    git push --set-upstream origin dev/stm32l152re-ili93418b-driver
  • As the hint says, the local branch dev/stm32l152re-ili93418b-driver must be associated with a remote branch and pushed to origin:
    ~$ git push --set-upstream origin dev/stm32l152re-ili93418b-driver
    Username for 'https://github.com': xxxxxx
    Password for 'https://xxxxxxx@github.com':
    Total 0 (delta 0), reused 0 (delta 0)
    remote:
    remote: Create a pull request for 'dev/stm32l152re-ili93418b-driver' on GitHub by visiting:
    remote: https://github.com/xxxxxx/incubator-nuttx/pull/new/dev/stm32l152re-ili93418b-driver
    remote:
    To https://github.com/xxxxx/incubator-nuttx
    * [new branch] dev/stm32l152re-ili93418b-driver -> dev/stm32l152re-ili93418b-driver
    Branch 'dev/stm32l152re-ili93418b-driver' set up to track remote branch 'dev/stm32l152re-ili93418b-driver' from 'origin'.
  • Once associated with the remote branch, a plain push suffices from then on. .git/config now looks like this:
    ~$ cat .git/config
    [core]
    repositoryformatversion = 0
    filemode = true
    bare = false
    logallrefupdates = true
    [remote "origin"]
    url = https://github.com/xxxxx/incubator-nuttx
    fetch = +refs/heads/*:refs/remotes/origin/*
    [branch "master"]
    remote = origin
    merge = refs/heads/master
    [remote "upstream"]
    url = https://github.com/apache/incubator-nuttx.git
    fetch = +refs/heads/*:refs/remotes/upstream/*
    [branch "dev/stm32l152re-ili93418b-driver"]
    remote = origin
    merge = refs/heads/dev/stm32l152re-ili93418b-driver
  • Fetch upstream's updates, merge them into the local master branch, then push to your own origin repository:
    ~$ git checkout master       # switch to the local master branch
    ~$ git fetch upstream        # fetch and sync the upstream changes
    ~$ git merge upstream/master # merge upstream into the local branch
    ~$ git push                  # push the local master to origin/master
  • Create new changes or files and push them to the remote branch:
    ~$ git add new-file.c
    ~$ git commit new-file.c
    ~$ git push
  • At this point the branch page on GitHub shows a Create Pull Request prompt; use it to submit a merge request of the current branch against upstream.
  • If, after running rebase upstream/master, the code turns out to be broken, git reset --hard HEAD~1 rolls back to a given commit (~[num] selects how many commits to step back); the gitg GUI shows this graphically. For example, to roll back to just before the branch's first commit, re-edit or re-add the files there, commit again, and then a git push --force is required. After that, git fetch upstream; git rebase upstream/master can be run again.

Squashing multiple commits into one

  • To keep the history concise and readable, squash several commits into one complete commit before pushing to the upstream repository. The command form is git rebase -i [startpoint] [endpoint]; if endpoint is omitted, the range ends at the commit the branch HEAD currently points to.
    ~$ git rebase -i HEAD~3
    pick a68185edc0f2cbd38c8fdbcffaf516278f4f Fix merge conflicts
    pick efe10a20278f53af9e6fff5754de39b8c8c4 net/icmp: add sanity check to avoid wild data length
    pick fb7480c67637bfa2164f4f76ceff6f509d24 net/neighbor/neighbor_ethernet_out.c: fix build error without ICMPv6
    pick 0a262336bd9964b693b57fe93d992482d5d3 arch/arm/src/stm32/stm32_otghsdev.c: Fix syslog formats
    [....]
    # Rebase dd4b5e0c68..18d489a8dd onto dd4b5e0c68 (32 commands)
    #
    # Commands:
    # p, pick <commit> = use commit /* keep this commit */
    # r, reword <commit> = use commit, but edit the commit message /* keep it, but edit its message */
    # e, edit <commit> = use commit, but stop for amending /* keep it, but stop to amend the commit itself, not just the message */
    # s, squash <commit> = use commit, but meld into previous commit /* merge this commit into the previous one */
    # f, fixup <commit> = like "squash", but discard this commit's log message /* like squash, but drop this commit's message */
    # x, exec <command> = run command (the rest of the line) using shell /* run a shell command */
    # b, break = stop here (continue rebase later with 'git rebase --continue')
    # d, drop <commit> = remove commit /* discard this commit */
    # l, label <label> = label current HEAD with a name
    # t, reset <label> = reset HEAD to a label
    # m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
    # . create a merge commit using the original merge commit's
    # . message (or the oneline, if no original merge commit was
    # . specified). Use -c <commit> to reword the commit message.
    #
    # These lines can be re-ordered; they are executed from top to bottom.
    #
    # If you remove a line here THAT COMMIT WILL BE LOST.
    #
    # However, if you remove everything, the rebase will be aborted.
    #
    # Note that empty commits are commented out
  • As the help text explains, several edit actions are supported. Suppose the list is changed as shown below. Saving switches to the commit-message editor; save again and the rebase runs. Then force-push the result with git push -f.
    pick c029a68185edc0f2cbd38c8fdbcffaf516278f4f Fix merge conflicts
    s efe10a20278f53af9e6fff5754de39b8c8c4 net/icmp: add sanity check to avoid wild data length
    s fb7480c67637bfa2164f4f76ceff6f509d24 net/neighbor/neighbor_ethernet_out.c: fix build error without ICMPv6
    f 0a262336bd9964b693b57fe93d992482d5d3 arch/arm/src/stm32/stm32_otghsdev.c: Fix syslog formats
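The same pick/squash edit can be reproduced non-interactively in a throwaway repo; this sketch uses GIT_SEQUENCE_EDITOR to rewrite the todo list and GIT_EDITOR=true to accept the combined commit message unchanged (GNU sed assumed for `sed -i`):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git config user.email you@example.com
git config user.name you
git commit -q --allow-empty -m base
# three small commits to squash
for i in 1 2 3; do
    echo "$i" >> f.txt
    git add f.txt
    git commit -q -m "step $i"
done
# turn lines 2-3 of the todo list from "pick" into "squash"
GIT_SEQUENCE_EDITOR='sed -i "2,3s/^pick/squash/"' GIT_EDITOR=true \
    git rebase -i HEAD~3
git log --oneline    # one squashed commit on top of "base"
```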
  • Under the hood this is just a file inside .git:
    ~$ head  .git/rebase-merge/git-rebase-todo.backup
    pick a68185edc0f2cbd38c8fdbcffaf516278f4f Fix merge conflicts
    pick efe10a20278f53af9e6fff5754de39b8c8c4 net/icmp: add sanity check to avoid wild data length
    pick fb7480c67637bfa2164f4f76ceff6f509d24 net/neighbor/neighbor_ethernet_out.c: fix build error without ICMPv6
    pick 0a262336bd9964b693b57fe93d992482d5d3 arch/arm/src/stm32/stm32_otghsdev.c: Fix syslog formats
    [....]
  • Once the branch has been merged into the mainline, it can be deleted:
    ~$ git branch -d <local-branch>
    ~$ git push origin --delete <remote-branch>
  • If a rebase produces CONFLICT (content): Merge conflict in src/xxxx.cpp, and it is certain that one side should simply win, the commands below resolve it automatically: --theirs takes the pulled upstream version, --ours takes the local version. Commit the result and the merge is done.
    ~$ git checkout --theirs src/xxxx.cpp
    ~$ git commit
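The merge case can be sketched in a throwaway repo: two branches edit the same line, the merge conflicts, and `git checkout --theirs` takes the incoming branch's version wholesale. (Note that during a rebase the roles of --ours/--theirs are inverted, because the branch being rebased is replayed as "theirs".)

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git config user.email you@example.com
git config user.name you
echo base > file.txt; git add file.txt; git commit -q -m base
git checkout -q -b feature
echo theirs-version > file.txt; git commit -q -am feature
git checkout -q -                    # back to the starting branch
echo ours-version > file.txt; git commit -q -am local
git merge feature || true            # CONFLICT (content) in file.txt
git checkout --theirs file.txt       # keep the feature branch's version
git add file.txt
git commit -q -m "merge feature, theirs wins"
cat file.txt                         # theirs-version
```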

Other Git usage

  • Export the most recent commit as a patch file:
~$ git format-patch -1 HEAD
  • When automatically resolving merge conflicts in the case where only one side has modifications, see the reference here; the following can be used:
~$ git merge --strategy-option theirs <branch>

Auto-incrementing a version number file on git commit

~$ cat .git/hooks/pre-commit
#!/bin/bash

export VER_FILE=src/utils/utils.pri

if [ ! -e ${VER_FILE} ]; then
exit 1;
fi

echo "start to update the version in $VER_FILE"
# 211.8.73
export VERSION=$(grep "^VERSION =" ${VER_FILE} | awk '{print $3}' | tr -d '\r')
# VARR(211 8 73)
IFS='.' read -r -a VARR <<< "$VERSION"
# VARR(211 8 74)
VARR[2]=$((VARR[2]+1))
# 211. 8 .74
VERSION=$(echo ${VARR[@]} | fold -w3 | paste -sd.)
# replace 211.8.73 with 211.8.74
sed -i "/^VERSION/s/=.*/= ${VERSION// /}/" ${VER_FILE}
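The hook's increment logic can be exercised in isolation. This POSIX-sh sketch replaces the fold/paste join with a plain variable join, which also behaves correctly when the version components are not all three digits wide (the hook's `fold -w3` trick assumes the widths in its own file):

```shell
# Split MAJ.MIN.PATCH, increment PATCH, re-join.
VERSION=211.8.73
IFS=. read -r MAJ MIN PATCH <<EOF
$VERSION
EOF
PATCH=$((PATCH+1))
VERSION="$MAJ.$MIN.$PATCH"
echo "$VERSION"    # 211.8.74
```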

Using a GUI tool (meld) for git diffs

  • Add the following to .git/config (or to your global .gitconfig):

    # Add the following to your .gitconfig file.
    [diff]
    tool = meld
    [difftool]
    prompt = false
    [difftool "meld"]
    cmd = meld "$LOCAL" "$REMOTE"
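Equivalently, the same keys can be set with git config rather than editing the file by hand; a sketch against a throwaway repo's local config:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git config diff.tool meld
git config difftool.prompt false
git config difftool.meld.cmd 'meld "$LOCAL" "$REMOTE"'
git config diff.tool    # meld
```

Use `git config --global` instead to apply the settings to all repositories.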
  • Compare the same file across two branches:

    ~$ git difftool mybranch master -- target.file
  • Compare the src directory between the current branch and other-branch; the argument after -- may be omitted, or may name a specific file or directory:

    ~$ git difftool other-branch -- src
  • Compare a file on the current branch against another commit:

    ~$ git difftool HEAD~2 -- src/file.txt
  • Merge other-branch into the current branch, resolving conflicts in favor of other-branch (theirs):

    ~$ git merge -X theirs other-branch
  • Roll back a merge:

    ~$ git reset --merge HEAD~1
  • Force-fetch the remote repository and overwrite the local one, instead of dealing with conflict prompts on pull (use with care):

    # fetch from the default remote, origin
    ~$ git fetch
    # reset your current branch (master) to origin's master
    ~$ git reset --hard origin/master
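The force-sync can be simulated with two throwaway repos: the clone makes a local commit, origin moves ahead differently, and fetch + reset --hard discards the local work in favor of origin's branch.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" config user.email you@example.com
git -C "$work/origin" config user.name you
echo v1 > "$work/origin/a.txt"
git -C "$work/origin" add a.txt
git -C "$work/origin" commit -q -m v1
git clone -q "$work/origin" "$work/clone"
git -C "$work/clone" config user.email you@example.com
git -C "$work/clone" config user.name you
# diverge on both sides
echo local-change > "$work/clone/a.txt"
git -C "$work/clone" commit -q -am local
echo v2 > "$work/origin/a.txt"
git -C "$work/origin" commit -q -am v2
# force-sync the clone to origin's state
branch=$(git -C "$work/clone" rev-parse --abbrev-ref HEAD)
git -C "$work/clone" fetch -q
git -C "$work/clone" reset --hard -q "origin/$branch"
cat "$work/clone/a.txt"    # v2
```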

STM32F4-Discovery(MB997C)

Audio (CS43L22) support

SPI SD card support

USB-OTG

BLE Sniffer

Ethernet 8720A

Arm Mbed-OS

Setting up the development environment

  • The latest version at the time of writing, v6.3, is used here. Mbed supports three kinds of development environment (desktop IDE, online IDE, and command line) across multiple platforms (Windows, Mac, Linux), something Keil and IAR cannot match. Following the official documentation, the desktop and command-line workflows are tried out below. Development is in C++, unlike the traditional C of Keil/IAR, with a code style similar to Arduino's, and it ships with a built-in RTOS (Keil RTX via CMSIS-RTOS2).

Mbed Studio

  • Using Mbed Studio requires registering an account; on first launch after installation the IDE asks you to log in. Like the Web Studio, it can import the official template projects from online.

    ~$ wget -c https://studio.mbed.com/installers/latest/linux/MbedStudio.sh
    ~$ ./MbedStudio.sh
    ~$ du -sh ~/.local/bin/mbed-studio
    ~$ ls ~/.config/"Mbed Studio"
    api-targets.json Cache Cookies GPUCache library-pipeline mbed-studio.log 'Network Persistent State'
    blob_storage config.json Cookies-journal library-cache 'Local Storage' mbed-studio-tools recentworkspace.json
  • Mbed Studio ships with Arm Compiler 6 by default, but it can be switched to the Arm Embedded GCC compiler:

~$ cat > ~/.config/"Mbed Studio"/external-tools.json <<EOF
> {
> "bundled": {
> "gcc": "/fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin"
> },
> "defaultToolchain": "GCC_ARM"
> }
> EOF
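It is worth sanity-checking that the JSON written above actually parses (python3 assumed); a stray comma or unquoted path would likely make Mbed Studio ignore the file. The sketch below writes a copy to /tmp so it is side-effect free:

```shell
cat > /tmp/external-tools.json <<'EOF'
{
  "bundled": {
    "gcc": "/fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin"
  },
  "defaultToolchain": "GCC_ARM"
}
EOF
# json.tool exits non-zero on malformed JSON
python3 -m json.tool /tmp/external-tools.json > /dev/null && echo valid
```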
  • As shown above, Mbed Studio is a VS Code-style IDE that uses JSON configuration files;
    open them via the menu: File -> Settings -> Open Preferences.

Testing the FRDM-KL25Z

Updating the OpenSDA firmware

  • Mbed Studio needs recent firmware to debug the FRDM-KL25Z: at least mbed_if_v2.0_frdm_kl25z.s19 is required for CMSIS-DAP support, so download the firmware from the link above (Pemicro_OpenSDA_Debug_MSD_Update_Apps_2020_05_12.zip). After unpacking, the folder contains *.SDA firmware files plus some notes and guides. When the computer is connected to the KL25Z's SDA port, a FRDM-KL25Z drive appears in the system.
  • One catch: to upgrade the firmware, the board must be put into bootloader mode, in which it mounts a drive named BOOTLOADER. It turned out the upgrade only works under Windows: hold the board's RST button while plugging SDA in. Because the board's firmware was v1.01, no USB drive would mount under Linux; only under Windows XP / Win 7 did it mount, and the upgrade succeeded there. The root cause was not investigated further.
  • The unpacked firmware directory also contains OpenSDA_Bootloader_Update_App_v111_2013_12_11.zip, which unpacks to the BOOTUPDATEAPP_Pemicro_v111.SDA firmware file. Copy the three files MSD-DEBUG-FRDM-KL25Z_Pemicro_v118.SDA, BOOTUPDATEAPP_Pemicro_v111.SDA and 20140530_k20dx128_kl25z_if_opensda.s19 directly into the BOOTLOADER drive to upgrade. Once done, plugging in SDA auto-mounts an MBED drive under Linux.
  • Hold the board's RST button again while plugging SDA in to enter BOOTLOADER mode, open SDA_INFO.HTM on the BOOTLOADER drive to jump to the web page, and check that the page's information matches the firmware versions just installed. From firmware v1.11 onward the BOOTLOADER drive also mounts automatically under Linux, which makes subsequent upgrades convenient.
  • Create a new project by importing the official mbed-os-example-blinky example; the IDE looks like this:
    mbed-studio-blinky-project.png
  • After the update, openocd can connect for debugging:
    ~$ openocd -c "adapter driver cmsis-dap" -f board/frdm-kl25z.cfg
    Open On-Chip Debugger 0.10.0+dev-01423-g3ffa14b04-dirty (2020-10-14-08:59)
    Licensed under GNU GPL v2
    For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
    Warn : Interface already configured, ignoring
    Info : auto-selecting first available session transport "swd". To override use 'transport select <transport>'.
    Info : add flash_bank kinetis kl25.pflash
    Info : Listening on port 6666 for tcl connections
    Info : Listening on port 4444 for telnet connections
    Info : CMSIS-DAP: SWD Supported
    Info : CMSIS-DAP: FW Version = 1.0
    Info : CMSIS-DAP: Interface Initialised (SWD)
    Info : SWCLK/TCK = 0 SWDIO/TMS = 1 TDI = 0 TDO = 0 nTRST = 0 nRESET = 1
    Info : CMSIS-DAP: Interface ready
    Info : clock speed 1000 kHz
    Info : SWD DPIDR 0x0bc11477
    Info : SWD DPIDR 0x0bc11477
    Error: Failed to write memory at 0xe000edf0
    Info : kl25.cpu: external reset detected
    Warn : **** Your Kinetis MCU is probably locked-up in RESET/WDOG loop. ****
    Warn : **** Common reason is a blank flash (at least a reset vector). ****
    Warn : **** Issue 'kinetis mdm halt' command or if SRST is connected ****
    Warn : **** and configured, use 'reset halt' ****
    Warn : **** If MCU cannot be halted, it is likely secured and running ****
    Warn : **** in RESET/WDOG loop. Issue 'kinetis mdm mass_erase' ****
    Info : starting gdb server for kl25.cpu on 3333
    Info : Listening on port 3333 for gdb connections

Mbed CLI

  • Installation requirements on Linux: the system needs Git, Python 3.7.x, and Mercurial installed first.
    ~$ sudo apt-get install python3 python3-pip git mercurial -y
    ~$ pip install mbed-cli

Installing and configuring the cross-toolchain

~$ mbed config -G ARM_GCC_PATH /fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin
[mbed] fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin now set as global ARM_GCC
~$ mbed config --list
[mbed] Global config:
GCC_ARM_PATH=/fullpath/gcc-arm-none-eabi-9-2020-q2-update/bin
ARMC6_PATH=/fullpath/ARMCompiler6.15/bin

[mbed] Local config (/home/michael):
Couldn't find valid mbed program in /home/michael

Creating a new project

~$ mbed new mbed-example-program
[mbed] Working path "/fullpath/Mbed Programs" (directory)
[mbed] Creating new program "mbed-example-program" (git)
[mbed] Adding library "mbed-os" from "https://github.com/ARMmbed/mbed-os" at branch/tag "latest"
[mbed] Updating reference "mbed-os" -> "https://github.com/ARMmbed/mbed-os/#0db72d0cf26539016efbe38f80d6f2cb7a3d4414"
[mbed] Auto-installing missing Python modules (mbed_cloud_sdk, mbed_ls, mbed_host_tests, mbed_greentea, manifest_tool, icetea, pycryptodome, cryptography)...
~$ tree -L 1 mbed-example-program/
mbed-example-program/
├── mbed_app.json
├── mbed-os
├── mbed-os.lib
└── mbed_settings.py

1 directory, 3 files
~$ mbed ls -a
[mbed] Working path "/fullpath/Mbed Programs/mbed-example-program" (program)
mbed-example-program (mbed-example-program)
`- mbed-os (https://github.com/ARMmbed/mbed-os#0db72d0cf265)

  • The project created above includes mbed-os support by default; the --create-only option creates a project without the OS:
    ~$ mbed new project2 --create-only
    [mbed] Working path "/fullpath/Mbed Programs" (directory)
    [mbed] Creating new program "project2" (git)
    ~$ tree -L 1 project2/
    project2/
    └── mbed_settings.py

Importing a project

~$ mbed import https://github.com/ARMmbed/mbed-os-example-blinky#mbed-os-5.15.0  my-blink
[mbed] Working path "/fullpath/github/ArmMbed" (directory)
[mbed] Importing program "my-blink" from "https://github.com/ARMmbed/mbed-os-example-blinky" at branch/tag "mbed-os-5.15.0"
[mbed] Adding library "mbed-os" from "https://github.com/ARMmbed/mbed-os" at rev #64853b354fa1

Adding a library to a project

~$ cd my-blink
~$ mbed add https://github.com/ARMmbed/mbed-cloud-client
[mbed] Working path "/fullpath/Mbed Programs/my-blink" (program)
[mbed] Adding library "mbed-cloud-client" from "https://github.com/ARMmbed/mbed-cloud-client" at latest revision in the current branch
[mbed] Updating reference "mbed-cloud-client" -> "https://github.com/ARMmbed/mbed-cloud-client/#f72a23e0dc21de4c82ee53fe947153341419a5b9"

  • To remove a library, simply run: mbed remove mbed-cloud-client

Building the project

  • List the supported boards:

    ~$ mbed compile --supported  # mbed compile -S
    [mbed] Working path "/fullpath/ArmMbed/mbed-os-example-blinky" (program)
    | Target | mbed OS 2 | mbed OS 5 | ARM | uARM | GCC_ARM | IAR |
    | ------------- | --------- | --------- | --------- | ---- | --------- | --------- |
    | ADV_WISE_1510 | - | Supported | Supported | - | Supported | Supported |
    | ADV_WISE_1570 | - | Supported | Supported | - | Supported | Supported |
    | ARCH_MAX | - | Supported | Supported | - | Supported | Supported |
    | ARCH_PRO | - | Supported | Supported | - | Supported | Supported |
    [......]

    ~$
  • Build:

    ~$ mbed compile -m KL25Z -t GCC_ARM
    [mbed] Working path "/fullpath/ArmMbed/my-blink" (program)
    Building project my-blink (KL25Z, GCC_ARM)
    Scan: my-blink
    Compile [ 0.4%]: at24mac.cpp
    [...]
    Link: my-blink
    Elf2Bin: my-blink
    | Module | .text | .data | .bss |
    | ---------------- | ------------- | ----------- | ----------- |
    | [fill] | 48(+48) | 0(+0) | 28(+28) |
    | [lib]/c.a | 4828(+4828) | 2108(+2108) | 89(+89) |
    | [lib]/gcc.a | 1004(+1004) | 0(+0) | 0(+0) |
    | [lib]/misc | 200(+200) | 4(+4) | 28(+28) |
    | main.o | 84(+84) | 0(+0) | 0(+0) |
    | mbed-os/drivers | 92(+92) | 0(+0) | 0(+0) |
    | mbed-os/hal | 1440(+1440) | 4(+4) | 67(+67) |
    | mbed-os/platform | 4204(+4204) | 264(+264) | 220(+220) |
    | mbed-os/rtos | 6468(+6468) | 168(+168) | 5973(+5973) |
    | mbed-os/targets | 2424(+2424) | 4(+4) | 19(+19) |
    | Subtotals | 20792(+20792) | 2552(+2552) | 6424(+6424) |
    Total Static RAM memory (data + bss): 8976(+8976) bytes
    Total Flash memory (text + data): 23344(+23344) bytes

    Image: ./BUILD/KL25Z/GCC_ARM/my-blink.bin
  • Set the default target and cross-toolchain:

    ~$ mbed target KL25Z
    [mbed] Working path "/fullpath/Mbed Programs/my-blink" (program)
    [mbed] KL25Z now set as default target in program "my-blink"
    ~$ mbed toolchain GCC_ARM
    [mbed] Working path "/fullpath/Mbed Programs/my-blink" (program)
    [mbed] GCC_ARM now set as default toolchain in program "my-blink"

Testing and debugging

  • Run the code tests:

    ~$ mbed test -m KL25Z -t GCC_ARM
    [...]
  • List the available test cases:

    ~$ mbed test --compile-list  | head
    Test Case:
    Name: mbed-os-features-device_key-tests-device_key-functionality
    Path: ./mbed-os/features/device_key/TESTS/device_key/functionality
    Test Case:
    Name: mbed-os-features-frameworks-utest-tests-unit_tests-basic_test
    Path: ./mbed-os/features/frameworks/utest/TESTS/unit_tests/basic_test
    Test Case:
    [....]
  • Run the tests on a connected board:

    ~$ mbed test -m KL25Z -t GCC_ARM --run
    [mbed] Working path "/home/michael/3TB-DISK/Mbed Programs/my-blink" (program)
    mbedgt: greentea test automation tool ver. 1.7.4
    mbedgt: test specification file './BUILD/tests/KL25Z/GCC_ARM/test_spec.json' (specified with --test-spec option)
    mbedgt: using './BUILD/tests/KL25Z/GCC_ARM/test_spec.json' from current directory!
    mbedgt: detecting connected mbed-enabled devices...
    mbedgt: detected 1 device
    mbedgt: processing target 'KL25Z' toolchain 'GCC_ARM' compatible platforms... (note: switch set to --parallel 1)
    mbedgt: running 4 tests for platform 'KL25Z' and toolchain 'GCC_ARM'
    mbedgt: mbed-host-test-runner: started
    mbedgt: retry mbedhtrun 1/1
    mbedgt: ['mbedhtrun', '-m', 'KL25Z', '-p', '/dev/ttyACM0:9600', '-f', '"BUILD/tests/KL25Z/GCC_ARM/mbed-os/TESTS/psa/spm_smoke/spm_smoke.bin"', '-e', '"mbed-os/TESTS/host_tests"', '-d', '/media/michael/MBED', '-c', 'default', '-t', '02000201242BD1925E8A1EE0', '-r', 'default', '-C', '4', '--sync', '5', '-P', '60'] failed after 1 count
    [...]

Arduino

  • Links:
    • stm32duino
    • wiki
    • Both the Nucleo-F767ZI and the Nucleo-L152RE are Arduino pin compatible; the Nucleo-L152RE is used as the target here.

Adding STM32 Cores

Flashing method

  • Download stm32-programmers, unpack it, and run the Linux installer directly; follow the wizard and install under the current user's home directory. After installation the directory looks like this:

    ~/STMicroelectronics/STM32Cube/STM32CubeProgrammer/bin$ ls
    ExternalLoader HSM libssl.so libstp11_SAM.so.conf STM32CubeProgrammer STM32MP_KeyGen_CLI STM32_Programmer_CLI
    FlashLoader libcrypto.so libstp11_SAM.so RSSe STM32CubeProgrammerLauncher STM32MP_SigningTool_CLI STM32_Programmer.sh

  • With the Nucleo-L152RE as the target board, select Tools -> Board: <any> -> STM32 Boards (select from submenu) -> Nucleo-64.

  • For the upload method, select Tools -> Upload method -> STM32CubeProgrammer (SWD)

Updating the ST-Link firmware

  • JTAG and SWD Guide

  • Flashing fails with the following errors:

     STM32CubeProgrammer v2.4.0
    -------------------------------------------------------------------

    Error: Old ST-LINK firmware version. Upgrade ST-LINK firmware
    Error: Old ST-LINK firmware version. Upgrade ST-LINK firmware
    Error: Old ST-LINK firmware!Please upgrade it.
    Error: Old ST-LINK firmware!Please upgrade it.

  • Download stsw-link007 and unpack it.

  • The unpacked directory looks like this:

    ~$ tree -L 2  stsw-link007
    stsw-link007
    ├── AllPlatforms
    │   ├── native
    │   ├── StlinkRulesFilesForLinux
    │   └── STLinkUpgrade.jar
    ├── readme.txt
    └── Windows
    ├── ST-LinkUpgrade.exe
    └── STLinkUSBDriver.dll
  • As its readme.txt explains, once the StlinkRulesFilesForLinux udev rules are installed you can run the GUI updater to update the firmware:

    ~$ java -jar STLinkUpgrade.jar
  • After updating to the latest firmware, the board can be read and written with ST's official STM32CubeProgrammer; however, using its ST-Link to flash and debug an external STM32F103 minimal system board over SWD still failed.

~$ st-info --probe
Found 1 stlink programmers
serial: 30363641464634383535353037353531383731xxxxxx
hla-serial: "\x30\x36\x36\x41\x46\x46\x34\x38\x35\x35\x35\x30\x37\x35\x35\x31\x38\x37\x31\x38\x32\x37\x34\x37"
flash: 2097152 (pagesize: 2048)
sram: 524288
chipid: 0x0451
descr: F76xxx

  • stlink-gui
~$ apt-get install stlink-gui
  • After a successful connection, the memory contents can be read.

SPI SD example

  • Use the stock library example File -> Examples -> Examples for any board -> SD -> listfiles, shown below. Its pin definitions are compatible with the Nucleo-L152RE except for CS: the original sketch uses pin 4, while the Nucleo-L152RE uses pin 10 (CS), so the only change needed is !SD.begin(10).
The circuit:
SD card attached to SPI bus as follows:
** MOSI - pin 11
** MISO - pin 12
** CLK - pin 13
** CS - pin 4 (for MKRZero SD: SDCARD_SS_PIN)

[...]
if (!SD.begin(10)) {
Serial.println("initialization failed!");
while (1);
}
[...]
  • Click Upload; if the build and flash succeed, watch the sketch's output on the serial port.

Raspberry Pi related

FreeRTOS

Protocol analysis tools

FTDI232H

PyFTDI

I2C

  • I2C communication: SCL (AD0), SDA (AD1, AD2). An I2C address is a 7-bit value; combined with an 8th direction bit (0: write, 1: read) it forms an 8-bit value.

    from pyftdi.i2c import I2cController
    # Instantiate an I2C controller
    i2c = I2cController()

    # Configure the first interface (IF/1) of the FTDI device as an I2C master
    i2c.configure('ftdi://ftdi:2232h/1')

    # Get a port to an I2C slave device
    slave = i2c.get_port(0x21)

    # Send one byte, then receive one byte
    slave.exchange([0x04], 1)

    # Write a register to the I2C slave
    slave.write_to(0x06, b'\x00')

    # Read a register from the I2C slave
    slave.read_from(0x00, 1)
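The 7-bit address plus direction bit described above can be sketched in plain Python (no hardware needed; `i2c_frame_address` is a hypothetical helper, not part of pyftdi):

```python
def i2c_frame_address(addr7: int, read: bool) -> int:
    """Return the 8-bit address byte seen on the bus:
    the 7-bit address shifted left by one, with the R/W bit in bit 0."""
    assert 0 <= addr7 <= 0x7F
    return (addr7 << 1) | (1 if read else 0)

# For the slave at 0x21 used above:
print(hex(i2c_frame_address(0x21, read=False)))  # 0x42 (write)
print(hex(i2c_frame_address(0x21, read=True)))   # 0x43 (read)
```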

SPI

  • Below, a UM232H is used to read the manufacturer ID (jedec_id) of a bare SPI NOR flash (W25Qxx). Wiring:

    UM232H           W25Q64FV
    AD0 <-----> CLK pin6
    AD1 <-----> DI pin5
    AD2 <-----> DO pin2
    AD3 <-----> CS pin1
    GND <-----> GND pin4
    VCC <-----> VCC pin8
    VCC <-----> /HOLD pin7
    VCC <-----> /WP pin3
  • A quick test that reads the jedec_id:

import usb
import usb.util
from pyftdi.spi import SpiController
dev = usb.core.find(idVendor=0x0403, idProduct=0x6014)

spi = SpiController()
spi.configure(dev)

# Get a port to a SPI slave w/ /CS on A*BUS3 and SPI mode 0 @ 12MHz
slave = spi.get_port(cs=0,freq=12E6,mode=0)

jedec_id = slave.exchange([0x9f],3)

print(hex(jedec_id[1]<< 8 | jedec_id[2]))
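The three bytes returned by command 0x9F are conventionally manufacturer ID, memory type, and a capacity exponent (capacity = 2^code bytes). A minimal decoding sketch, assuming the commonly documented W25Q64 ID bytes EF 40 17 (`decode_jedec_id` is a hypothetical helper):

```python
def decode_jedec_id(jid):
    """Split a 3-byte JEDEC ID into (manufacturer, memory type, size in bytes)."""
    manufacturer, mem_type, capacity_code = jid
    return manufacturer, mem_type, 1 << capacity_code

mfg, mtype, size = decode_jedec_id([0xEF, 0x40, 0x17])
print(hex(mfg), hex(mtype), size // (1024 * 1024), "MiB")  # 0xef 0x40 8 MiB
```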
pyspiflash/spiflash/tests$ ./serialflash.py
Using FTDI device ftdi://ftdi:232h:1:67/1
Flash device: Winbond W25Q64 8 MiB @ SPI freq 12.0 MHz
.Read 8192 KiB in 7 seconds @ 1152 KiB/s
..Erase 1024 KiB from flash @ 0x700000 (may take a while...)
Erased 1024 KiB in 17 seconds @ 59 KiB/s
Build test sequence
Writing 1024 KiB to flash (may take a while...)
Wrote 1024 KiB in 14 seconds @ 70 KiB/s
Reading 1024 KiB from flash
Read 1024 KiB in 915 ms @ 1118 KiB/s
Verify flash
Reference: 4942ea371ad576065759f232f429a8abf10c755a
Retrieved: 4942ea371ad576065759f232f429a8abf10c755a
...
----------------------------------------------------------------------
Ran 6 tests in 42.775s

OK

Reading flash with flashrom

  • FT2232SPI_Programmer

  • Unpacking the binary firmware /w Binwalk

  • Zyxel firmware extraction and password analysis

  • flashrom

  • Probe the flash chip type:

    ~$ flashrom -L

    ~$ flashrom -p ft2232_spi:type=2232H,port=A
    flashrom v1.2 on Linux 5.16.13-20220310 (x86_64)
    flashrom is free software, get the source code at https://flashrom.org

    Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
    Found Macronix flash chip "MX25L6405" (8192 kB, SPI) on ft2232_spi.
    Found Macronix flash chip "MX25L6405D" (8192 kB, SPI) on ft2232_spi.
    Found Macronix flash chip "MX25L6406E/MX25L6408E" (8192 kB, SPI) on ft2232_spi.
    Found Macronix flash chip "MX25L6436E/MX25L6445E/MX25L6465E/MX25L6473E/MX25L6473F" (8192 kB, SPI) on ft2232_spi.
    Multiple flash chip definitions match the detected chip(s): "MX25L6405", "MX25L6405D", "MX25L6406E/MX25L6408E", "MX25L6436E/MX25L6445E/MX25L6465E/MX25L6473E/MX25L6473F"
    Please specify which chip definition to use with the -c <chipname> option.

  • Read the flash contents:

~$ flashrom -p ft2232_spi:type=2232H,port=A -r test-mx25l6445e.rom -c "MX25L6436E/MX25L6445E/MX25L6465E/MX25L6473E/MX25L6473F"
flashrom v1.2 on Linux 5.16.13-20220310 (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Found Macronix flash chip "MX25L6436E/MX25L6445E/MX25L6465E/MX25L6473E/MX25L6473F" (8192 kB, SPI) on ft2232_spi.
Reading flash... done.

Display drivers

OpenOCD

  • OpenOCD supports the FTDI MPSSE mode: --enable-ftdi Enable building support for the MPSSE mode of FTDI
~$ cd openocd
~$ ./configure --enable-sysfsgpio --enable-buspirate --enable-ftdi
  • Besides enabling this support in openocd, the wiring also matters. Both interface/ftdi/ft232h-module-swd.cfg and interface/ftdi/minimodule-swd.cfg were tried here and both connect successfully. The config files carry wiring notes in their comments; some of the links above suggest a 470 ohm resistor between ADBUS1 and ADBUS2, but none was used here. Just make sure the vid_pid in the config file matches the connected hardware.
  • A J-Link-OB or another board's on-board ST-Link needs only three wires, but here nTRST must also be connected; on the stm32f103zet6 that is PB4.
    # FT232HQ minimodule channel 0 (Channel A)
    # Connector FTDI Target
    # Pin Name
    # --------- ------ ------
    # CN2-10 GND GND
    # CN2-13 ADBUS0 (TCK) SWCLK
    # CN2-14 ADBUS2 (TDI/TDO) SWDIO
    # CN2-15 ADBUS1 (TDO/TDI) SWDIO
    # CN2-17 ADBUS4 (GPIOL0) nTRST

~$ openocd -f interface/ftdi/ft232h-module-swd.cfg -f target/stm32f1x.cfg -c init \
-c "reset halt" -c "flash write_image erase nuttx.bin 0x08000000"

Raspberry Pi

Saleae logic analyzer

Saleae AnalyzerSDK

SDIO protocol plugin

  • SaleaeSDIOAnalyzer
  • The AnalyzerSDK and SaleaeSDIOAnalyzer directories must sit side by side in the same parent directory; then enter SaleaeSDIOAnalyzer and run cmake . && make. If there are no errors, libSDIOAnalyzer.so is produced.
  • Configure Logic to look for the Analyzer Plugin
    Launch Logic manually
    Options -> Preferences
    Under [For Developers], “Search this path for Analyzer Plugins”
    Browse for the ../sdmmc-analyzer/xcode4/build/Debug directory
    Click “Save” and close Logic

SDMMC protocol

  • SD/MMC Analyzer for Logic
  • The upstream source is built with a Python script; here a CMake script placed in its directory handles the build instead. On success you will see libSDMMCAnalyzer.so.
sdmmc-analyzer$ cat CMakeLists.txt
project("Saleae SDMMC Analyzer")
cmake_minimum_required(VERSION 3.0)

message(WARNING "CMake support is still experimental!")

# Find Analyzer include dir
find_path(
ANALYZER_SDK_INCLUDE_DIR
NAMES
Analyzer.h
AnalyzerChannelData.h
AnalyzerHelpers.h
AnalyzerResults.h
AnalyzerSettings.h
AnalyzerTypes.h
SimulationChannelDescriptor.h
PATHS
../include/
../AnalyzerSDK/include
DOC
"Include directory of the analyzer SDK."
)

if(NOT ANALYZER_SDK_INCLUDE_DIR)
message(SEND_ERROR "Analyzer SDK include directory not found")
else()
message(STATUS
"Analyzer SDK include directory found at ${ANALYZER_SDK_INCLUDE_DIR}")
endif()

# needed to differ between 32 and 64 bit library
set(ANALYZER_BITNESS)
set(ANALYZER_LIB_NAME "")

if(CMAKE_SIZEOF_VOID_P EQUAL 4)
message(STATUS "32 Bit detected")
set(ANALYZER_BITNESS 32)
set(ANALYZER_LIB_NAME "Analyzer")
elseif(CMAKE_SIZEOF_VOID_P EQUAL 8)
message(STATUS "64 Bit detected")
set(ANALYZER_BITNESS 64)
set(ANALYZER_LIB_NAME "Analyzer64")
else()
message(FATAL_ERROR "Environment not supported")
endif()

if(NOT (WIN32 OR UNIX))
# I have no idea what to do under MacOS
message(WARNING "Environment may not be supported")
endif()

# find library
find_library(
ANALYZER_SDK_LIBRARY
NAMES
${ANALYZER_LIB_NAME}
PATHS
../lib/
../AnalyzerSDK/lib/
DOC
"Analyzer SDK library. \
If you set it yourself, choose the correct architecture"
)

if(NOT ANALYZER_SDK_LIBRARY)
message(SEND_ERROR "Analyzer SDK library not found")
else()
message(STATUS "Analyzer SDK library found at ${ANALYZER_SDK_LIBRARY}")
endif()


add_library(SDMMCAnalyzer SHARED
src/SDMMCAnalyzer.cpp
src/SDMMCAnalyzer.h
src/SDMMCAnalyzerResults.cpp
src/SDMMCAnalyzerResults.h
src/SDMMCAnalyzerSettings.cpp
src/SDMMCAnalyzerSettings.h
src/SDMMCHelpers.cpp
src/SDMMCHelpers.h
src/SDMMCSimulationDataGenerator.cpp
src/SDMMCSimulationDataGenerator.h
)

target_include_directories(SDMMCAnalyzer PUBLIC
source
${ANALYZER_SDK_INCLUDE_DIR}
)

target_link_libraries(SDMMCAnalyzer
PUBLIC
${ANALYZER_SDK_LIBRARY}
)

target_compile_features(SDMMCAnalyzer
PRIVATE
cxx_nullptr
)

QSPI protocol plugin

STM8S103

Building examples with SDCC

Arduino support

Programming tools

  • stm8flash
  • stm8flash is an open-source tool that programs via ST-Link hardware; here I still use the openocd + ft2232 approach to flash the chip.

Other resources


Building NuttX

Atmel SAM4S Xplained Pro

Introduction

  • Core

    • ARM Cortex-M4 with 2 Kbytes of cache running at up to 120 MHz
    • Memory Protection Unit (MPU)
    • DSP Instruction Set
    • Thumb ® -2 instruction set
  • Memories

    • Up to 2048 Kbytes embedded Flash with optional dual-bank and cache memory, ECC, Security Bit and Lock Bits
    • Up to 160 Kbytes embedded SRAM
    • 16 Kbytes ROM with embedded boot loader routines (UART, USB) and IAP routines
    • 8-bit Static Memory Controller (SMC): SRAM, PSRAM, NOR and NAND Flash support
  • Clone the nuttx and nuttx-apps sources into two sibling directories:

    ~$ git clone https://github.com/apache/incubator-nuttx nuttx
    ~$ git clone https://github.com/apache/incubator-nuttx-apps.git apps

    ~$ cd nuttx
    # List all supported boards.
    ~$ tools/configure.sh -L | grep "sam"
    ~$ tools/configure.sh -l sam4s-xplained-pro:nsh
    ~$ cp boards/arm/sam34/sam4s-xplained-pro/configs/nsh/defconfig .config
    ~$ cp boards/arm/sam34/sam4s-xplained-pro/scripts/Make.defs .

    # A third-party toolchain (gcc-arm-none-eabi-6-2017-q2-update, prefix arm-none-eabi-) also builds this successfully; select CONFIG_ARMV7M_TOOLCHAIN_GNU_EABIL=y


    ~$ export PATH=/fullpath/gcc-arm-none-eabi-6-2017-q2-update/bin:$PATH
    ~$ make CROSSDEV=arm-none-eabi-

    # List the CPU variants supported by the cross toolchain.
    ~$ arm-none-eabi-g++ -print-multi-lib
    .;
    thumb;@mthumb
    fpu;@mfloat-abi=hard
    armv6-m;@mthumb@march=armv6s-m
    armv7-m;@mthumb@march=armv7-m
    armv7e-m;@mthumb@march=armv7e-m
    armv7-ar/thumb;@mthumb@march=armv7
    cortex-m7;@mthumb@mcpu=cortex-m7
    armv7e-m/softfp;@mthumb@march=armv7e-m@mfloat-abi=softfp@mfpu=fpv4-sp-d16
    armv7e-m/fpu;@mthumb@march=armv7e-m@mfloat-abi=hard@mfpu=fpv4-sp-d16
    armv7-ar/thumb/softfp;@mthumb@march=armv7@mfloat-abi=softfp@mfpu=vfpv3-d16
    armv7-ar/thumb/fpu;@mthumb@march=armv7@mfloat-abi=hard@mfpu=vfpv3-d16
    cortex-m7/softfp/fpv5-sp-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=softfp@mfpu=fpv5-sp-d16
    cortex-m7/softfp/fpv5-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=softfp@mfpu=fpv5-d16
    cortex-m7/fpu/fpv5-sp-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=hard@mfpu=fpv5-sp-d16
    cortex-m7/fpu/fpv5-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=hard@mfpu=fpv5-d16

  • For the following error, open the indicated source location and comment the offending lines out:

arch/arm/src/imxrt/Kconfig:1114: syntax error
arch/arm/src/imxrt/Kconfig:1113: invalid option
make: *** [tools/Makefile.unix:471: olddefconfig] Error 1
ERROR: failed to refresh

Building a cross toolchain with Buildroot

  • buildroot nuttx
  • Here the NuttX-maintained fork of Buildroot is used: Buildroot builds a special-purpose cross toolchain, which is then used to build the final NuttX image. The buildroot source directory sits at the same level as nuttx and nuttx-apps. In practice, enabling BR2_GCC_CORTEX_M4F_SP caused the error shown in the debugging section below: the Atmel SAM4S Xplained Pro has a Cortex-M4 core but no FPU, so forcing an FPU toolchain is bound to fail.
~$ git clone https://bitbucket.org/nuttx/buildroot.git buildroot
~$ cp configs/cortexm4f-eabi-defconfig-4.7.4 .config
# Pick suitable versions; here: binutils-2.26.1, gcc-4.7.4, gdb-8.0.1. Since the config is cortexm4f-eabi-defconfig-4.7.4, gcc-4.7.4 is chosen.
# When configuring nuttx, select CONFIG_ARMV7M_TOOLCHAIN_BUILDROOT=y (for a third-party toolchain, as above, select CONFIG_ARMV7M_TOOLCHAIN_GNU_EABIL=y).
~$ make menuconfig
~$ make
# The final configuration is as follows:
~$ grep -v '^$\|^#' .config
BR2_HAVE_DOT_CONFIG=y
BR2_arm=y
BR2_cortex_m3=y
BR2_GCC_CORTEX=y
BR2_ARM_EABI=y
BR2_ARCH="arm"
BR2_GCC_TARGET_TUNE="cortex-m3"
BR2_GCC_TARGET_ARCH="armv7-m"
BR2_GCC_TARGET_ABI="aapcs-linux"
BR2_WGET="wget --passive-ftp"
BR2_SVN="svn co"
BR2_ZCAT="zcat"
BR2_BZCAT="bzcat"
BR2_TAR_OPTIONS=""
BR2_DL_DIR="$(BASE_DIR)/dl"
BR2_STAGING_DIR="$(BUILD_DIR)/staging_dir"
BR2_NUTTX_DIR="$(TOPDIR)/../nuttx"
BR2_TOPDIR_PREFIX=""
BR2_TOPDIR_SUFFIX=""
BR2_GNU_BUILD_SUFFIX="pc-elf"
BR2_GNU_TARGET_SUFFIX="nuttx-eabi"
BR2_PREFER_IMA=y
BR2_PACKAGE_BINUTILS=y
BR2_BINUTILS_VERSION_2_26_1=y
BR2_BINUTILS_VERSION="2.26.1"
BR2_EXTRA_BINUTILS_CONFIG_OPTIONS=""
BR2_PACKAGE_GCC=y
BR2_GCC_VERSION_4_7_4=y
BR2_GCC_SUPPORTS_SYSROOT=y
BR2_GCC_SUPPORTS_DOWN_PREREQ=y
BR2_GCC_DOWNLOAD_PREREQUISITES=y
BR2_GCC_VERSION="4.7.4"
BR2_EXTRA_GCC_CONFIG_OPTIONS=""
BR2_INSTALL_LIBSTDCPP=y
BR2_PACKAGE_GDB_HOST=y
BR2_GDB_VERSION_8_0_1=y
BR2_PACKAGE_GDB_TUI=y
BR2_GDB_VERSION="8.0.1"
BR2_PACKAGE_GENROMFS=y
BR2_PACKAGE_KCONFIG_FRONTENDS=y
BR2_KCONFIG_VERSION_4_11_0_1=y
BR2_KCONFIG_FRONTENDS_VERSION="4.11.0.1"
BR2_LARGEFILE=y
BR2_TARGET_OPTIMIZATION="-Os -pipe"

# After a successful build you can run: arm-nuttx-eabi-g++ -print-multi-lib
BR2_ENABLE_MULTILIB=y
BR2_LARGEFILE=y
BR2_SOFT_FLOAT=y
BR2_TARGET_OPTIMIZATION="-Os -pipe"
  • On success the following directory tree is generated. Note: with BR2_ENABLE_MULTILIB=y and BR2_SOFT_FLOAT=y enabled the output directory is build_arm_nofpu; otherwise it is build_arm.
~$ tree -L 2 build_arm_hf/
build_arm_hf/
├── root
└── staging_dir
├── arm-elf -> arm-nuttx-eabi
├── arm-nuttx-eabi
├── bin
├── include
├── lib
├── libexec
├── share
└── usr

10 directories, 0 files
  • Test the toolchain:
~$ export PATH=`pwd`/build_arm_hf/staging_dir/bin:$PATH
~$ arm-nuttx-eabi-g++ -print-multi-lib
.;
thumb;@mthumb
fpu;@mfloat-abi=hard
  • If the following error appears, try selecting a newer gdb version in make menuconfig:
    /fullpath/buildroot/toolchain_build_arm_hf/gdb-7.9.1/gdb/python/python.c: In function ‘_initialize_python’:
    /fullpath/buildroot/toolchain_build_arm_hf/gdb-7.9.1/gdb/python/python.c:1690:3: error: too few arguments to function ‘_PyImport_FixupBuiltin’
    _PyImport_FixupBuiltin (gdb_module, "_gdb");

  • If gcc fails to download its mpc, mpfr and gmp prerequisites during the build, check toolchain_build_arm_hf/gcc-4.9.4/contrib/download_prerequisites and adjust the version numbers or the download URLs.

Patching gcc-4.7.4

  • If the following error appears while building the cross toolchain:
In file included from .../gcc-4.7.4/gcc/cp/except.c:990:0:
cfns.gperf: At top level:
cfns.gperf:101:1: error: 'gnu_inline' attribute present on 'libc_name_p'
cfns.gperf:26:14: error: but not here
~$ cat > toolchain/gcc/4.7.4/gnu_inline.patch <<EOF
diff --git a/gcc/cp/cfns.gperf b/gcc/cp/cfns.gperf
index 68acd3d..953262f 100644
--- a/gcc/cp/cfns.gperf
+++ b/gcc/cp/cfns.gperf
@@ -22,6 +22,9 @@ __inline
static unsigned int hash (const char *, unsigned int);
#ifdef __GNUC__
__inline
+#ifdef __GNUC_STDC_INLINE__
+__attribute__ ((__gnu_inline__))
+#endif
#endif
const char * libc_name_p (const char *, unsigned int);
%}
diff --git a/gcc/cp/cfns.h b/gcc/cp/cfns.h
index 1c6665d..6d00c0e 100644
--- a/gcc/cp/cfns.h
+++ b/gcc/cp/cfns.h
@@ -53,6 +53,9 @@ __inline
static unsigned int hash (const char *, unsigned int);
#ifdef __GNUC__
__inline
+#ifdef __GNUC_STDC_INLINE__
+__attribute__ ((__gnu_inline__))
+#endif
#endif
const char * libc_name_p (const char *, unsigned int);
/* maximum key range = 391, duplicates = 0 */
EOF

GCC-4.7.4 build errors

make[4]: Entering directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build/libiberty/testsuite'
make[4]: Nothing to be done for 'install'.
make[4]: Leaving directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build/libiberty/testsuite'
make[3]: Leaving directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build/libiberty'
/bin/bash: line 3: cd: arm-nuttx-eabi/libgcc: No such file or directory
make[2]: *** [Makefile:10334: install-target-libgcc] Error 1
make[2]: Leaving directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build'
make[1]: *** [Makefile:2115: install] Error 2
make[1]: Leaving directory '/fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build'
make: *** [toolchain/gcc/gcc-nuttx-4.x.mk:159: /fullpath/buildroot/toolchain_build_arm_nofpu/gcc-4.7.4-build/.installed] Error 2
  • The error above appears to be specific to this GCC version; 4.9.x does not seem to hit it. Checking the various config.log files and Makefiles turned up nothing, so in the end I entered the build directory toolchain_build_arm/gcc-4.7.4-build and ran make there directly; once that finished cleanly, running make again from the buildroot top level completed normally.

Configuring the NuttX system

~$ tools/configure.sh -l  sam4s-xplained-pro:nsh
~$ cp boards/arm/sam34/sam4s-xplained-pro/configs/nsh/defconfig .config
~$ cp boards/arm/sam34/sam4s-xplained-pro/scripts/Make.defs .
~$ make menuconfig
~$ make
[...]
LD: nuttx
make[1]: Leaving directory '/fullpath/nuttx/arch/arm/src'
CP: nuttx.hex
CP: nuttx.bin
  • The final NuttX configuration:
~$ grep -v '^$\|^#' .config
  • The build error below is likely the result of running make without CONFIG_ARCH_IRQPRIO=y selected. Searching the nuttx sources shows that getcontrol(void) is an inline function defined in arch/arm/include/armv7-m/irq.h; adding #include <nuttx/irq.h> near the top of arch/arm/src/chip/sam_start.c fixes it.
/fullpath/nuttx/staging/libarch.a(sam_start.o): In function `sam_fpuconfig':
/fullpath/nuttx/arch/arm/src/chip/sam_start.c:159: undefined reference to `getcontrol`
/fullpath/nuttx/arch/arm/src/chip/sam_start.c:161: undefined reference to `setcontrol`

Debugging with OpenOCD

Building OpenOCD

~$ git clone http://repo.or.cz/r/openocd.git
~$ cd openocd && mkdir build-linux && cd build-linux
~$ ../configure --enable-ftdi --enable-stlink --enable-ti-icdi --enable-ulink --enable-usb-blaster-2 --enable-ft232r --enable-xds110 --enable-usbprog --enable-armjtagew --enable-cmsis-dap --enable-usb-blaster --enable-openjtag --enable-jlink --enable-bcm2835gpio --enable-imx_gpio --enable-oocd_trace --enable-buspirate --enable-sysfsgpio
~$ make && make install
  • According to the official Debugging Nuttx notes, debugging certain features may require adjusting the macros in openocd/src/rtos/nuttx_header.h to match your configuration, e.g. CONFIG_DISABLE_MQUEUE=y.

  • Start the OpenOCD server. Connect the PC via USB to the connector labeled DEBUG USB on the Atmel SAM4S Xplained Pro. Note that this USB connector is the composite port of the on-board debugger (EDBG) and carries three functions: DEBUG, Virtual COM Port, and Data Gateway Interface (DGI).

# With a board-level config file this can also be run directly as: openocd -f board/atmel_sam4s_xplained_pro.cfg -c init -c "reset halt"
~$ openocd -f interface/cmsis-dap.cfg -f target/at91sam4sd32x.cfg -c init -c "reset halt"
Open On-Chip Debugger 0.10.0+dev-01408-g762ddcb74-dirty (2020-09-25-00:32)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "swd". To override use 'transport select <transport>'.
Info : CMSIS-DAP: SWD Supported
Info : CMSIS-DAP: JTAG Supported
Info : CMSIS-DAP: FW Version = 1.0
Info : CMSIS-DAP: Serial# = ATML1803040200001055
Info : CMSIS-DAP: Interface Initialised (SWD)
Info : SWCLK/TCK = 1 SWDIO/TMS = 1 TDI = 1 TDO = 1 nTRST = 0 nRESET = 1
Info : CMSIS-DAP: Interface ready
Info : clock speed 500 kHz
Info : SWD DPIDR 0x2ba01477
Info : sam4.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : starting gdb server for sam4.cpu on 3333
Info : Listening on port 3333 for gdb connections
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x00400554 msp: 0x200034d0
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : accepting 'telnet' connection on tcp/4444
Info : dropped 'telnet' connection
Info : accepting 'telnet' connection on tcp/4444
Info : dropped 'telnet' connection
Info : accepting 'gdb' connection on tcp/3333
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x00400554 msp: 0x200034d0

  • Attach GDB to the OpenOCD server for debugging:
~$ arm-nuttx-eabi-gdb nuttx
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-elf --target=arm-nuttx-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from nuttx...done.
(gdb) target extended-remote :3333
0x00400554 in arm_earlyserialinit () at chip/sam_serial.c:1345
1345 up_disableallints(TTYS1_DEV.priv, NULL);
(gdb) load # Load (i.e. flash) the image onto the board.
# The OpenOCD terminal prints corresponding progress messages.
Loading section .text, size 0x19003 lma 0x400000
Loading section .ARM.extab, size 0x30 lma 0x419004
Loading section .ARM.exidx, size 0xd0 lma 0x419034
Loading section .data, size 0x220 lma 0x419104
Start address 0x4000cc, load size 103203
Transfer rate: 18 KB/sec, 10320 bytes/write.
(gdb) cont # Continue. An exception occurs here; this faulty run is caused by building the toolchain with `BR2_GCC_CORTEX_M4F_SP=y`, which does not match the target: the SAM4S-XPRO is a Cortex-M4 but has no FPU.
Continuing.
sam4.cpu -- clearing lockup after double fault

Program received signal SIGINT, Interrupt.
exception_common () at armv7-m/gnu/arm_exception.S:176
176 vstmdb sp!, {s16-s31} /* Save the non-volatile FP context */
(gdb) i r # For the full set of GDB commands, the documentation for your GDB version is the most authoritative and detailed reference.
r0 0x3 3
r1 0x1 1
r2 0x20001e78 536878712
[....]

  • Several options can also be combined on one command line, e.g. arm-nuttx-eabi-gdb -ex "target remote :3333" -ex "mon reset halt" nuttx

  • For messages like ATSAM4SD32C.cpu -- clearing lockup after double fault, the handling followed a stackexchange discussion.

  • Inspect the produced artifacts:

~$ file nuttx
nuttx: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped
~$ file nuttx.bin
nuttx.bin: data
~$ file nuttx.hex
nuttx.hex: ASCII text, with CRLF line terminators

Debugging with GDB

The OpenOCD server

# ~$ openocd -f board/atmel_sam4s_xplained_pro.cfg -c init -c 'reset halt' -c '$_TARGETNAME configure -rtos nuttx'
~$ openocd -f board/atmel_sam4s_xplained_pro.cfg -c '$_TARGETNAME configure -rtos nuttx'

Connecting GDB

  • nuttx的目录下运行,为了加载nuttx文件.
~ nuttx$ arm-none-eabi-gdb -ex "target remote :3333" -ex "mon reset halt"  nuttx
  • Define some gdb hook functions to make debugging a bit easier. The best place for these defines is ~/.gdbinit, but note: if you later use the system gdb to debug non-NuttX programs, remember to remove ~/.gdbinit.
(gdb) define hookpost-file
Type commands for definition of "hookpost-file".
End with a line saying just "end".
> eval "monitor nuttx.pid_offset %d", &((struct tcb_s *)(0))->pid
> eval "monitor nuttx.xcpreg_offset %d", &((struct tcb_s *)(0))->xcp.regs
> eval "monitor nuttx.state_offset %d", &((struct tcb_s *)(0))->task_state
> eval "monitor nuttx.name_offset %d", &((struct tcb_s *)(0))->name
> eval "monitor nuttx.name_size %d", sizeof(((struct tcb_s *)(0))->name)
>end
  • Connect to the remote openocd port:
(gdb) target extended-remote :3333
Remote debugging using :3333
__start () at chip/sam_start.c:269
269 {
(gdb) file nuttx

  • The file nuttx command above loads the file's symbols; because the hook on file is defined, openocd prints information such as:
Error: No symbols for NuttX  # May be shown; if it appears every time, `CONFIG_DEBUG_SYMBOLS` is not enabled
Info : pid_offset: 12
Info : xcpreg_offset: 132
Info : state_offset: 26
Info : name_offset: 208
Info : name_size: 16

  • Show thread information:
(gdb) info threads
warning: while parsing threads: not well-formed (invalid token)
Id Target Id Frame
* 1 Remote target __start () at chip/sam_start.c:269
  • Show the registers:
(gdb) info registers
r0 0x0 0
r1 0x20004360 536888160
r2 0x20004360 536888160
r3 0x1 1
r4 0xc 12
r5 0x200010cc 536875212
r6 0x200010cc 536875212
r7 0x3 3
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x20004290 536887952
sp 0x20004340 0x20004340
lr 0x8012621 134293025
pc 0x8003b4c 0x8003b4c <memcpy+20>
xPSR 0x61000000 1627389952

Setting breakpoints

  • Breakpoints

  • The breakpoints below are set on specific source lines. To avoid having to re-enter them when the program runs away or the hardware resets, save them to bp.txt; next time they can be reloaded with source bp.txt.

(gdb) b chip/sam_hsmci.c:757
Breakpoint 1 at 0x417cf8: file chip/sam_hsmci.c, line 757.
(gdb) b mmcsd/mmcsd_sdio.c:2780
Breakpoint 2 at 0x415fc4: file mmcsd/mmcsd_sdio.c, line 2780.
(gdb) save breakpoints bp.txt
Saved to file 'bp.txt'.
(gdb) rbreak bp.txt
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x00417cf8 in sam_clock at chip/sam_hsmci.c:757
2 breakpoint keep y 0x00415fc4 in mmcsd_probe at mmcsd/mmcsd_sdio.c:2780

  • Inspect the call stack. When bringing up a board, before the serial console is ready (for example when the crystal fails to start), backtrace is very useful:
(gdb) backtrace
#0 sam_clock (dev=0x2000016c <g_sdiodev>, rate=CLOCK_SD_TRANSFER_1BIT)
at chip/sam_hsmci.c:1580
#1 0x00415e56 in mmcsd_sdinitialize (priv=0x200043b0) at mmcsd/mmcsd_sdio.c:3023
#2 mmcsd_probe (priv=priv@entry=0x200043b0) at mmcsd/mmcsd_sdio.c:3465
#3 0x004162d2 in mmcsd_mediachange (arg=0x200043b0) at mmcsd/mmcsd_sdio.c:2545
#4 0x004185f2 in sdio_mediachange (dev=0x2000016c <g_sdiodev>, cardinslot=<optimized out>)
at chip/sam_hsmci.c:2799
#5 0x004148e4 in sam_hsmci_initialize () at sam_hsmci.c:180
#6 0x0041478a in board_app_initialize (arg=arg@entry=0) at sam_appinit.c:129
#7 0x004120fa in boardctl (cmd=cmd@entry=65281, arg=arg@entry=0) at boardctl.c:326
#8 0x00405cfa in nsh_initialize () at nsh_init.c:103
#9 0x00405ccc in nsh_main (argc=1, argv=0x200059c8) at nsh_main.c:143
#10 0x00403858 in nxtask_startup (entrypt=0x405ca9 <nsh_main>, argc=1, argv=0x200059c8)
at sched/task_startup.c:165
#11 0x00401320 in nxtask_start () at task/task_start.c:144
#12 0x00000000 in ?? ()

  • Single stepping (s = step into, n = step over). Stepping into can descend deep into the call stack; use the finish command to return to the caller.
(gdb) step
100 ret = nxsig_nanosleep(&rqtp, &rmtp);
(gdb) stepi
nxsig_nanosleep (rqtp=rqtp@entry=0x20004330, rmtp=rmtp@entry=0x20004338) at signal/sig_nanosleep.c:108
108 {
(gdb) n
116 if (rqtp == NULL || rqtp->tv_nsec < 0 || rqtp->tv_nsec >= 1000000000)
  • Inspect variables:
btle_main (argc=<error reading variable: value has been optimized out>, argv=<error reading variable: value has been optimized out>) at nrf24l01_btle.c:372
372 memcpy(chunk(buffer,pls)->data,&hum,2);
(gdb) p hum
$5 = 1844
(gdb) p temp
$6 = 3139
(gdb) p buffer.playload
There is no member named playload.
(gdb) x buffer.payload
0x200010d4 <buffer+8>: 0x06000102
(gdb) x/32b buffer.payload
0x200010d4 <buffer+8>: 2 1 0 6 9 0 0 0
0x200010dc <buffer+16>: 0 0 7 22 0 0 0 0
0x200010e4 <buffer+24>: 0 0 40 7 -56 0 0 0
0x200010ec <current>: 0 0 0 0 0 0 0 0
(gdb) x/32x buffer.payload # The array in hex format.
0x200010d4 <buffer+8>: 0x02 0x01 0x00 0x06 0x09 0x00 0x00 0x00
0x200010dc <buffer+16>: 0x00 0x00 0x07 0x16 0x00 0x00 0x00 0x00
0x200010e4 <buffer+24>: 0x00 0x00 0x28 0x07 0xc8 0x00 0x00 0x00
0x200010ec <current>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) print /x buffer.payload # Print the array.
$2 = {0x2, 0x1, 0x0, 0x6, 0x9, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7, 0x16, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x28, 0x7, 0xc8, 0x0, 0x0, 0x0}
  • Hexdump an address or array:
(gdb) x /32bx data
0x20004374: 0x44 0x12 0x00 0x20 0x20 0x00 0x00 0x00
0x2000437c: 0x95 0x3a 0x01 0x08 0x44 0x12 0x00 0x20
0x20004384: 0x03 0x00 0x00 0x00 0x04 0x00 0x00 0x00
0x2000438c: 0x07 0x5d 0x01 0x08 0x10 0x00 0x00 0x00

  • Inspect a struct:
(gdb) set print pretty on
(gdb) p dev
$4 = (struct nrf24l01_dev_s *) 0x20003370
(gdb) p *dev
$5 = {
spi = 0x20000174 <g_spi2dev>,
config = 0x20000140 <nrf_cfg>,
state = ST_STANDBY,
tx_payload_noack = 0 '\000',
en_aa = 63 '?',
en_pipes = 1 '\001',
ce_enabled = 0 '\000',
lastxmitcount = 0 '\000',
addrlen = 5 '\005',
pipedatalen = "!\000\000\000\000",
pipe0addr = "\001\312\376\022\064",
last_recvpipeno = 0 '\000',
sem_tx = {
semcount = 0
},
tx_pending = 1 '\001',
rx_fifo = 0x200033d0 "h\020",
fifo_len = 0,
nxt_read = 0,
nxt_write = 0,
[.....]

Flashing the firmware

  • Flash to flash0; the address 0x00400000 is defined in the linker script boards/arm/sam34/sam4s-xplained-pro/scripts/sam4s-xplained-pro.ld.
~$ openocd -f board/atmel_sam4s_xplained_pro.cfg -c init -c "reset halt" -c "flash write_image erase nuttx.bin 0x00400000" -c "reset"
Open On-Chip Debugger 0.10.0+dev-01408-g762ddcb74-dirty (2020-09-25-00:32)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "swd". To override use 'transport select <transport>'.
Info : CMSIS-DAP: SWD Supported
Info : CMSIS-DAP: JTAG Supported
Info : CMSIS-DAP: FW Version = 1.0
Info : CMSIS-DAP: Serial# = ATML1803040200001055
Info : CMSIS-DAP: Interface Initialised (SWD)
Info : SWCLK/TCK = 1 SWDIO/TMS = 1 TDI = 1 TDO = 1 nTRST = 0 nRESET = 1
Info : CMSIS-DAP: Interface ready
Info : clock speed 500 kHz
Info : SWD DPIDR 0x2ba01477
Info : ATSAM4SD32C.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : starting gdb server for ATSAM4SD32C.cpu on 3333
Info : Listening on port 3333 for gdb connections
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x004000cc msp: 0x20001ee4
Info : sam4 does not auto-erase while programming (Erasing relevant sectors)
Info : sam4 First: 0x00000000 Last: 0x0000000c
Info : Erasing sector: 0x00000000
Info : Erasing sector: 0x00000001
Info : Erasing sector: 0x00000002
Info : Erasing sector: 0x00000003
Info : Erasing sector: 0x00000004
Info : Erasing sector: 0x00000005
Info : Erasing sector: 0x00000006
Info : Erasing sector: 0x00000007
Info : Erasing sector: 0x00000008
Info : Erasing sector: 0x00000009
Info : Erasing sector: 0x0000000a
Info : Erasing sector: 0x0000000b
Info : Erasing sector: 0x0000000c
auto erase enabled
wrote 106496 bytes from file nuttx.bin in 5.418800s (19.192 KiB/s)

  • Connect to its UART interface and log into the system.
~$ sudo minicom -o -b 115200 -D /dev/ttyACM0
NuttShell (NSH) NuttX-9.1.0
nsh> mm # test memory

NAND porting

sam_nand_initialize: CS0
nand_initialize: cmdaddr=0x60400000 addraddr=0x60200000 dataaddr=0x60000000
onfi_ebidetect: cmdaddr=60400000 addraddr=60200000 dataaddr=60000000
onfi_read: cmdaddr=60400000 addraddr=60200000 dataaddr=60000000
onfi_read: Returning:
onfi_read: manufacturer: 0x2c
onfi_read: buswidth: 0
onfi_read: luns: 1
onfi_read: eccsize: 4
onfi_read: model: 0x @
onfi_read: sparesize: 64
onfi_read: pagesperblock: 64
onfi_read: blocksperlun: 2048
onfi_read: pagesize: 2048
nand_initialize: Found ONFI compliant NAND FLASH
nand_devscan: Retrieving bad block information. nblocks=2048
  • Reads and writes operate on pages; erases operate on blocks. Conceptually, from largest to smallest:
Nand Flash ⇒ Chip ⇒ Plane ⇒ Block ⇒ Page ⇒ oob
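From the onfi_read log above (pagesize 2048, pagesperblock 64, blocksperlun 2048), the byte offset of any page in the data area follows directly. A minimal sketch — the constants and helper name here are mine for illustration, not a NuttX API:

```python
# Geometry reported by onfi_read above (illustrative constants, not a NuttX API).
PAGE_SIZE = 2048          # bytes per page (data area, OOB bytes not counted)
PAGES_PER_BLOCK = 64
BLOCKS_PER_LUN = 2048

def page_offset(block, page):
    """Byte offset of (block, page) inside the data area."""
    assert 0 <= block < BLOCKS_PER_LUN
    assert 0 <= page < PAGES_PER_BLOCK
    return (block * PAGES_PER_BLOCK + page) * PAGE_SIZE

# Total data capacity: 2048 blocks * 64 pages * 2048 bytes = 256 MiB.
total_bytes = BLOCKS_PER_LUN * PAGES_PER_BLOCK * PAGE_SIZE
```

So block 1, page 0 begins at 0x20000 (128 KiB), matching a 64-page block of 2 KiB pages.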
 hexdump /dev/mtdblock0 count=128
/dev/mtdblock0 at 00000000:
0000: eb 3c 90 4e 55 54 54 58 20 20 20 00 08 20 01 00 .<.NUTTX .. ..
0010: 02 00 02 00 00 f8 05 00 3f 00 ff 00 00 00 00 00 ........?.......
0020: 00 00 02 00 00 00 29 00 00 00 00 20 20 20 20 20 ......)....
0030: 20 20 20 20 20 20 46 41 54 31 36 20 20 20 0e 1f FAT16 ..
0040: be 5b 7c ac 22 c0 74 0b 56 b4 0e bb 07 00 cd 10 .[|.\".t.V.......
0050: 5e eb f0 32 e4 cd 16 cd 19 eb fe 54 68 69 73 20 ^..2.......This
0060: 69 73 20 6e 6f 74 20 61 20 62 6f 6f 74 61 62 6c is not a bootabl
0070: 65 20 64 69 73 6b 2e 20 20 50 6c 65 61 73 65 20 e disk. Please
  • Check the coding style of modified nuttx files:
    git status | grep "modified:" | awk '{print $2}' | xargs tools/checkpatch.sh -f
  • Memory map: for the SAMA5D3 Series see Chapter 5 "Memories" of its datasheet; for the SAM4S see the SAM4S Datasheet Chapter 6 "Product Mapping".

Arduino Due

  • Arduino Due
  • Hacking with the Arduino Due
  • SAM3X-Arduino Pin Mapping
  • The Arduino Due is a microcontroller board based on the Atmel SAM3X8E ARM Cortex-M3 CPU. It is the first Arduino board based on a 32-bit ARM core microcontroller. It has 54 digital input/output pins (of which 12 can be used as PWM outputs), 12 analog inputs, 4 UARTs (hardware serial ports), a 84 MHz clock, an USB OTG capable connection, 2 DAC (digital to analog), 2 TWI, a power jack, an SPI header, a JTAG header, a reset button and an erase button.
  • Per the official warning: the Arduino Due's pins tolerate only 3.3 V; anything above 3.3 V will damage the board.

Flashing with BOSSAC

  • Here we use the official flashing tool BOSSAC, which normally lives under the user's home directory, e.g. ~/.arduino15/packages/arduino/tools/bossac/1.7.0-arduino3/bossac.

Detecting the target board

~$  bossac -p ttyACM0 -U false -i
No device found on ttyACM0
  • Set the correct serial port parameters; if that still fails, press reset and retry, as follows:
~$ stty -F /dev/ttyACM0 speed 1200 cs8 -cstopb -parenb
115200
~ $ bossac -p ttyACM0 -U false -i
Atmel SMART device 0x285e0a60 found
Device : ATSAM3X8
Chip ID : 285e0a60
Version : v1.1 Dec 15 2010 19:25:04
Address : 524288
Pages : 2048
Page Size : 256 bytes
Total Size : 512KB
Planes : 2
Lock Regions : 32
Locked : none
Security : false
Boot Flash : false

Flashing nuttx.bin

$ bossac -p ttyACM0 -U false -e -w -v -b nuttx.bin -R
Atmel SMART device 0x285e0a60 found
Erase flash
done in 0.041 seconds

Write 62368 bytes to flash (244 pages)
[==============================] 100% (244/244 pages)
done in 12.866 seconds

Verify 62368 bytes of flash
[==============================] 100% (244/244 pages)
Verify successful
done in 12.240 seconds
Set boot flash true
CPU reset.

Flashing over SWD

  • Besides BOSSAC, I originally wanted to flash and debug via the SWD interface with OpenOCD, because the Due board has a 4-pin DEBUG header with pinout: 1:RESET, 2:SWDIO, 3:SWCLK, 4:GND.
    ~$ cat > ~/sam3x8e.cfg<<EOF
    source [find interface/stlink.cfg]

    set CPUTAPID 0x2ba01477

    source [find board/atmel_sam3x_ek.cfg]
    EOF

    ~$ openocd -f ~/sam3x8e.cfg -c init -c halt -c "flash write_image erase nuttx.bin 0x80000" -c "at91sam3 gpnvm set 1" -c "exit"

Flashing with the on-board AT16u2

  • at16u2_cmsis_dap

  • I found online that the AT16u2's firmware can be modified to turn it into a CMSIS_DAP adapter, which then supports flashing and debugging with openocd.

Updating the AT16u2 firmware

  • Upgrading16U2Due

  • Flashing an AVR chip requires a programmer such as an AVR JTAG-ICE, AVR-ISP, Atmel-ICE, or USBasp. An arduino board (e.g. an Arduino Uno) can also be turned into an AVR-ISP. Here a NUCLEO-L152RE with arduino-compatible headers is available; using stm32duino it can become an AVR-ISP programmer by flashing the official ArduinoISP sketch onto it.

  • Open Arduino IDE --> File --> Examples (Built-in examples) --> 11.ArduinoISP --> ArduinoISP, and upload it to the NUCLEO-L152RE.

NUCLEO-L152RE        Arduino Due (ICSP)
D10 CS <------> Reset
D11 MOSI <------> MOSI
D12 MISO <------> MISO
D13 SCK <------> SCK
GND <------> GND
+5V <------> +5V

avrdude

  • Download the avrdude source code. avrdude is an open-source AVR flashing tool; its latest release is avrdude-6.3, and the source shows details of the hardware it supports.

  • Below, the NUCLEO-L152RE acts as the programmer, wired as above, to update the firmware of the at16u2 on the Arduino Due board.

    ~$ cd   ~/.arduino15/packages/arduino/tools/avrdude/6.3.0-arduino17/
    ~$ tree
    .
    ├── bin
    │   └── avrdude
    └── etc
    └── avrdude.conf

    ~$ avrdude -c arduino -P /dev/ttyACM0 -b 19200 -p atmega16u2 -vvv -U flash:w:at16u2_cmsis_dap/at16u2_cmsis_dap.elf.hex:i

    avrdude: Version 6.3-20171130
    Copyright (c) 2000-2005 Brian Dean, http://www.bdmicro.com/
    Copyright (c) 2007-2014 Joerg Wunsch

    System wide configuration file is "/etc/avrdude.conf"
    User configuration file is "/home/michael/.avrduderc"
    User configuration file does not exist or is not a regular file, skipping

    Using Port : /dev/ttyACM0
    Using Programmer : arduino
    Overriding Baud Rate : 19200
    AVR Part : ATmega16U2
    Chip Erase delay : 9000 us
    PAGEL : PD7
    BS2 : PC6
    RESET disposition : possible i/o
    RETRY pulse : SCK
    serial program mode : yes
    parallel program mode : yes
    Timeout : 200
    StabDelay : 100
    CmdexeDelay : 25
    SyncLoops : 32
    ByteDelay : 0
    PollIndex : 3
    PollValue : 0x53
    Memory Detail :

    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    eeprom 65 20 4 0 no 512 4 128 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    flash 65 6 128 0 yes 16384 128 128 4500 4500 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    lfuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    hfuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    efuse 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    lock 0 0 0 0 no 1 0 0 9000 9000 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    calibration 0 0 0 0 no 1 0 0 0 0 0x00 0x00
    Block Poll Page Polled
    Memory Type Mode Delay Size Indx Paged Size Size #Pages MinW MaxW ReadBack
    ----------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
    signature 0 0 0 0 no 3 0 0 0 0 0x00 0x00

    Programmer Type : Arduino
    Description : Arduino
    Hardware Version: 2
    Firmware Version: 1.18
    Topcard : Unknown
    Vtarget : 0.0 V
    Varef : 0.0 V
    Oscillator : Off
    SCK period : 0.1 us

Jumper wires

  • Since GND and RESET on the Due board are already routed to the AT16u2, only two jumper wires on the board are needed; the at16u2 then acts as a CMSIS-DAP device.
 ICSP            DEBUG(SWD)

SCK (3) <-----> SCK (2)
MOSI(4) <-----> SDO (3)
  • Flash with OpenOCD.
~$ openocd -f interface/cmsis-dap.cfg -f board/atmel_sam3x_ek.cfg  -c init -c halt -c "flash write_image erase nuttx.bin 0x80000" -c "at91sam3 gpnvm set 1" -c "exit"

Using a UM232H (FTDI) as the programmer

  • The approach above used an arduino board as the programmer; here an FT232H USB-to-SPI/I2C breakout board is used instead, wired to the board's DEBUG (SWD) header.
# FT232HQ minimodule channel 0 (Channel A)
# Connector FTDI Arduino Due(SWD)
# Pin Name
# --------- ------ ------
# CN2-10 GND GND (pin1)
# CN2-13 ADBUS0 (TCK) SWCLK (pin2)
# CN2-14 ADBUS2 (TDI/TDO) SWDIO (pin3)
# CN2-15 ADBUS1 (TDO/TDI) SWDIO (pin3)
# CN2-17 ADBUS4 (GPIOL0) nTRST (pin4)

  • You can also debug through the Due board's standard 10-pin ARM-JTAG header, wired much like the above, using either the full JTAG signals or only the SWD signals. When adapting to a 20-pin header, connect the corresponding signal lines.

  • Flash with OpenOCD; if the board still carries an arduino image, press reset and immediately run the command below.

~$ openocd -f interface/ftdi/ft232h-module-swd.cfg  -f board/atmel_sam3x_ek.cfg  -c init -c halt -c "flash write_image erase nuttx.bin 0x80000" -c "at91sam3 gpnvm set 1" -c "exit"

Connecting with OpenOCD

Miscellaneous

  • Write code to FLASH don’t change boot mode and don’t reset. This lets
    you examine the FLASH contents that you just loaded while the bootloader
    is still active.
~$ bossac.exe --port=COM26 --usb-port=false -e -w -v --boot=0 nuttx.bin
Write 64628 bytes to flash
[==============================] 100% (253/253 pages)
Verify 64628 bytes of flash
[==============================] 100% (253/253 pages)
Verify successful
  • Verify the FLASH contents (the bootloader must be running)
~$ bossac.exe --port=COM26 --usb-port=false -v nuttx.bin
Verify 64628 bytes of flash
[==============================] 100% (253/253 pages)
Verify successful
  • Read from FLASH to a file (the bootloader must be running):
~$ bossac.exe --port=COM26 --usb-port=false --read=4096 nuttx.dump
Read 4096 bytes from flash
[==============================] 100% (16/16 pages)
  • Change to boot from FLASH
~$ bossac.exe --port=COM26 --usb-port=false --boot=1
Set boot flash true

Restoring the AT16u2 firmware

  • ArduinoCore-sam

  • The following restores the original firmware of the AT16u2 on the Arduino Due board, this time with the FT232H connected to its ICSP header rather than an Arduino board as the programmer. All the firmware arduino supports ships inside its installation; since the Arduino Due uses the SAM3X8E chip, look under the ~/.arduino15/packages/arduino/hardware/sam/ directory, as follows.

~$ tree ~/.arduino15/packages/arduino/hardware/sam/1.6.12/firmwares
.arduino15/packages/arduino/hardware/sam/1.6.12/firmwares
└── atmega16u2
├── Arduino-DUE-usbserial-prod-firmware-2012-11-05.hex
├── Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex
└── arduino-usbserial
├── Arduino-usbserial.c
├── Arduino-usbserial.h
├── Board
│   └── LEDs.h
├── Descriptors.c
├── Descriptors.h
├── Lib
│   └── LightweightRingBuff.h
├── makefile
└── readme.txt

4 directories, 10 files

  • For the wiring, consult /etc/avrdude.conf and follow the comments inside it, adapting them to your own board.

FT232H Arduino Due (ICSP)

pin15 ADBUS2 <------> MISO pin1
+5V <------> +5V 2
pin13 ADBUS0 <------> SCK 3
pin14 ADBUS1 <------> MOSI 4
pin16 ADBUS3 <------> Reset 5
GND <------> GND 6

  • The flashing command is as follows; if there are problems, add -vvv to inspect the verbose output.
~$ avrdude -C /etc/avrdude.conf -c UM232H -P /dev/ttyUSB0 -b 19200 -p atmega16u2 -U flash:w:Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex:i

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1e9489 (probably m16u2)
avrdude: NOTE: "flash" memory has been specified, an erase cycle will be performed
To disable this feature, specify the -D option.
avrdude: erasing chip
avrdude: reading input file "Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex"
avrdude: writing flash (4314 bytes):

Writing | ################################################## | 100% 7.56s

avrdude: 4314 bytes of flash written
avrdude: verifying flash memory against Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex:
avrdude: load data flash data from input file Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex:
avrdude: input file Arduino-DUE-usbserial-prod-firmware-2013-02-05.hex contains 4314 bytes
avrdude: reading on-chip flash data:

Reading | ################################################## | 100% 7.28s

avrdude: verifying ...
avrdude: 4314 bytes of flash verified

avrdude: safemode: Fuses OK (E:F4, H:D9, L:FF)

avrdude done. Thank you.

Thanks for your support

  • WeChat QR code:

Processing CSV

  • A sample file looks like this:
$ head  TFWP_2020Q1_Positive_EN.csv
"Employers Who Were Issued a Positive Labour Market Impact Assessment (LMIA) by Program Stream, National Occupational Classification (NOC) 2011 and Business Location, January to March 2020",,,,,
Province/Territory,Program Stream,Employer ,Address,Occupation,Approved Positions
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0","2273-Deck officers, water transport",4
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0","2274-Engineer officers, water transport",4
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0",7242-Industrial electricians,1
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0",7532-Water transport deck and engine room crew,9
Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,"Wanchai, A0A0A0",7612-Other trades helpers and labourers,1
Newfoundland and Labrador, High Wage,Bailey Veterinary Surgical Specialty Ltd.,"St. John's, A1N3J7",3114-Veterinarians,1
Newfoundland and Labrador, High Wage,Eastern Regional Health Authority,"Mount Pearl, A1N3J5",3111-Specialist physicians,1
Newfoundland and Labrador, High Wage,WesTower Communications Ltd.,"St. John's, A1A5G6",7245-Telecommunications line and cable workers,2

[....]

  • As shown above, the first line is effectively a title, and the second line holds the CSV column fields, separated by commas.
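Note that quoted fields such as "Wanchai, A0A0A0" contain commas themselves; any CSV parser must honor the quoting. A small self-contained check with Python's stdlib csv module on one line from the file:

```python
import csv
import io

# One data row from the sample file above; the address is a quoted field.
line = ('Newfoundland and Labrador, High Wage,Anglo Eastern Ship Managment Ltd,'
        '"Wanchai, A0A0A0","2273-Deck officers, water transport",4\n')

row = next(csv.reader(io.StringIO(line)))
# The quoted address stays a single field despite its embedded comma.
print(len(row), row[3])
```

A naive line.split(",") would yield eight pieces here; csv.reader correctly yields six fields.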
In [34]: import pandas as pd
# Skip reading the first line; alternatively use skiprows=1, nrows=8814 to read 8814 rows starting from the second line.
In [35]: a = pd.read_csv("/home/michael/Documents/TFWP_2020Q1_Positive_EN.csv",encoding="latin",skiprows=1)
In [37]: a.head()
Out[37]:
Province/Territory Program Stream Employer Address Occupation Approved Positions
0 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 2273-Deck officers, water transport 4.0
1 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 2274-Engineer officers, water transport 4.0
2 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 7242-Industrial electricians 1.0
3 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 7532-Water transport deck and engine room crew 9.0
4 Newfoundland and Labrador High Wage Anglo Eastern Ship Managment Ltd Wanchai, A0A0A0 7612-Other trades helpers and labourers 1.0

# e.g. export directly to HTML or Excel.
In [38]: a.to_html("name.html")

In [39]: a.to_excel("name.xlsx")

  • Filter on specific fields; below, find every row whose Occupation field starts with 2175-Web.
a[a.Occupation.str.startswith('2175-Web')]
Out[85]:
Province/Territory Program Stream Employer Address Occupation Approved Positions
53 Nova Scotia High Wage 10094277 Canada Inc Halifax, B3J2T9 2175-Web designers and developers 2
124 Nova Scotia Global Talent Stream 3rDi Laboratory Inc. Wolfville, B4P3R6 2175-Web designers and developers 2
207 Quebec High Wage 213A Studio Créatif Inc. Montréal, H2S3X3 2175-Web designers and developers 1
249 Quebec High Wage 9122-4790 Québec Inc. Laval, H7M5Y6 2175-Web designers and developers 1
511 Quebec High Wage Géoplus Inc. Laval, H7L5B7 2175-Web designers and developers 1
609 Quebec High Wage les produits de fenetres sol-r (2000) inc. montreal, H4N1H8 2175-Web designers and developers 1
668 Quebec High Wage Ossiaco Inc Montreal, H3C2G9 2175-Web designers and developers 1
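Beyond filtering, a grouped aggregation can total the Approved Positions per province. A minimal pandas sketch over a hypothetical three-row subset (the tiny DataFrame below is illustrative, not taken from the real file):

```python
import pandas as pd

# Illustrative subset mirroring the columns of the LMIA file above.
df = pd.DataFrame({
    "Province/Territory": ["Nova Scotia", "Quebec", "Quebec"],
    "Occupation": ["2175-Web designers and developers"] * 3,
    "Approved Positions": [2, 1, 1],
})

# Same startswith filter as above, then sum positions per province.
web = df[df["Occupation"].str.startswith("2175-Web")]
totals = web.groupby("Province/Territory")["Approved Positions"].sum()
print(totals)
```

On the real DataFrame a, the same groupby/sum applies after the a.Occupation.str.startswith('2175-Web') filter.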


Error handling

In [2]: pd.read_csv("/home/michael/Documents/TFWP_2020Q1_Positive_EN.csv")
[.....]
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 1: invalid continuation byte

  • Try reading with encoding="latin", e.g.: pd.read_csv("/home/michael/Documents/TFWP_2020Q1_Positive_EN.csv", encoding="latin").
