Work tool and command tricks 1.0
Shell命令相关,python命令相关,pandas相关,git相关。
Shell tricks
查找
grep '[symbols]' [file]
grep -h '[symbols]' [files] #多文件时不输出filename
计行数
wc -l
分割统计
awk -F "," '{print $1}' file.txt #用逗号分隔,第二个
awk -F "," '{ print $4 "\t" $5}' file.txt #用逗号分隔,第四个\t第五个
挂起job
nohup python test.py > log.txt & #挂起一个python job,并将stdout仅输出到log.txt中
分割文件
split -l 10000 file.txt #按行数分割,每10K行一个文件
split -b 500M file.txt #按size分割,每500M一个文件
字符串批量操作
sed -i "s/abcd/abce/g" *.txt # 将txt文件中的abcd替换为abce
Python tricks
遍历目录下全部文件,处理后并重命名
rawdir = '/path/to/dir'
for root,dirs,files in os.walk(rawdir):
for f in files:
source_file_path = os.path.join(root,f)
target_file_path = source_file_path.replace('source','target')
with open(source_file_path,'r') as fin:
with open(target_file_path,'w') as fout:
fout.write(func(fin))
Hash脱敏
#default hash
hash(obj) # 输出为long,可以用hex()转换为十六进制str
#md5 hash
import hashlib
hashlib.md5(obj).hexdigest() #输出为str
URL decode
#python2
import urllib
name = 'abcd' # make sure name is a str, not a unicode
name = urllib.unquote(name)
#python3
import urllib.parse
name = 'abcd'
name = urllib.parse.unquote(name)
Pandas tricks
类别统计个数
df['col'].value_counts()
Git tricks
查看branch
git branch
切换branch
git checkout [branch]
创建branch
git checkout [branch] #切换到要创建的branch的父节点
git checkout -b [new branch]
删除 local branch
git checkout [other branch]
git branch -d [branch]
删除远程branch
git push origin --delete [branch]
推local branch到远程
git push --set-upstream origin [branch]
压缩commit
git rebase -i HEAD~n #n is the last n commits to compress
#然后把除latest的commit pick -> squash
#保存退出
同步master
git pull origin master
解决冲突
git mergetool
查看当前change
git diff
git diff [branch] #查看当前change与branch之间的diff
git diff [branch1] [branch2] #查看branch1与branch2之间的diff
查看commit的change
git show HEAD~0 #最新一个的diff
land code流程
# Start a new change set. Should branch from master HEAD.
$ git checkout master
$ git pull
$ git checkout -b <branch-name>
# Edit and commit your change to your local git.
$ git add <files you want to commit>
$ git commit # Important: DO NOT include "--amend" for the first time.
# More edit and commit. Subsequent commits SHOULD use "--amend"
# to merge all your changes into one atomic commit.
$ git add <files you want to commit>
$ git commit --amend
# When ready, send for code review. Run git pull to update your code to HEAD.
# --rebase means to apply your change on top of HEAD.
# Add the reviewer github name (e.g. "yanghuachu") under reviewers
# Add "datavisor-fte" under subscribers (to cc everyone) if needed.
$ git pull --rebase origin master
# After you resolve the conflicts (if any), you are ready to send to our diff server.
$ arc diff
Written on August 1, 2019