linux:parallel_rsync
Next revision | Previous revision | ||
parallel_rsync [2012/12/06 17:10] – created dodger | linux:parallel_rsync [2022/02/11 11:36] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== [SCRIPT] psync (parallel rsync) ====== | ||
+ | |||
====== Description ====== | ====== Description ====== | ||
This set of scripts parallelizes the transfer of a huge directory tree while capping the number of simultaneous transfers. | This set of scripts parallelizes the transfer of a huge directory tree while capping the number of simultaneous transfers. | ||
====== Instructions ====== | ====== Instructions ====== | ||
+ | I suggest launching psync with the following line: | ||
+ | <code bash> | ||
+ | ./psync.sh / | ||
+ | </ | ||
+ | Do not put a trailing slash on the path: | ||
+ | * NO: < | ||
+ | * YES: ./psync.sh / | ||
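To guard against a trailing slash slipping in, the path could be normalized before it is used. This is only a minimal sketch: `strip_trailing_slash` is a hypothetical helper that is not part of psync.sh, and the example path is made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of psync.sh): remove trailing slashes
# from a path, but leave the root directory "/" untouched.
strip_trailing_slash() {
    local path="$1"
    while [ "$path" != "/" ]; do
        case "$path" in
            */) path="${path%/}" ;;   # drop one trailing slash and re-check
            *)  break ;;              # no trailing slash left
        esac
    done
    printf '%s\n' "$path"
}

# Example with a made-up path.
strip_trailing_slash "/srv/data/"
```

A script could then assign the cleaned value to its own target variable before doing anything else with it.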
+ | |||
+ | |||
===== Pre-Reqs ===== | ===== Pre-Reqs ===== | ||
* GNU screen | * GNU screen | ||
Line 8: | Line 19: | ||
* ssh | * ssh | ||
- | ===== Headline | + | ===== psync.sh |
+ | ==== Description ==== | ||
+ | |||
+ | This script will: | ||
+ | * Check if the directory to transfer exists | ||
+ | * Calculate the directories to transfer at the maximum depth of // | ||
+ | * Transfer the upper directories in parallel, from depth 1 to depth // | ||
+ | * Transfer the directories in parallel at depth // | ||
+ | * Think that the // | ||
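The depth-based enumeration described in the list above can be tried on its own. The sketch below uses the same `find -mindepth N -maxdepth N -type d` idea as the script; the function name `list_dirs_at_depth` and the throwaway directory tree are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# list_dirs_at_depth ROOT DEPTH
# Print every directory located exactly DEPTH levels below ROOT, one per line.
# (Hypothetical helper; psync.sh inlines the find call instead.)
list_dirs_at_depth() {
    local root="$1" depth="$2"
    find "$root" -mindepth "$depth" -maxdepth "$depth" -type d | sort
}

# Demo on a temporary tree; the directory names are made up.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/a/b/c" "$demo_root/a/d" "$demo_root/e"

# Reading line by line keeps names that contain spaces in one piece.
list_dirs_at_depth "$demo_root" 2 | while IFS= read -r dir; do
    printf 'would sync: %s\n' "$dir"
done

rm -rf "$demo_root"
```

Reading the `find` output with `while read` rather than an unquoted `for ... in $(find ...)` is a deliberate choice here: it avoids splitting directory names on whitespace.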
+ | |||
+ | ==== Code ==== | ||
+ | |||
+ | <file bash psync.sh> | ||
+ | # | ||
+ | [ ! $1 ] && echo " | ||
+ | |||
+ | TARGET=" | ||
+ | |||
+ | [[ ! " | ||
+ | [ ! -d ${TARGET} ] && echo -e " | ||
+ | |||
+ | LOGDIR=$(dirname $0)/ | ||
+ | [ -d ${LOGDIR} ] && echo " | ||
+ | mkdir -p ${LOGDIR}/ | ||
+ | |||
+ | check_max_processes() | ||
+ | { | ||
+ | local MAXPARALEL=$1 | ||
+ | while [ $(ps waux | egrep ": | ||
+ | printf " | ||
+ | sleep 1 | ||
+ | done | ||
+ | } | ||
+ | |||
+ | sync_this() | ||
+ | { | ||
+ | local MAXDEPTH=3 | ||
+ | local MAXPARALEL=20 | ||
+ | |||
+ | LAUCHRSYNC=" | ||
+ | local y=0 | ||
+ | for FOLDER in $(find ${TARGET} -mindepth ${MAXDEPTH} -maxdepth ${MAXDEPTH} -type d) ; do | ||
+ | DIRLIST[$y]=" | ||
+ | let y++ | ||
+ | done | ||
+ | |||
+ | echo " | ||
+ | for ((i=0; | ||
+ | let x=0 | ||
+ | for ITEM in $(find ${TARGET} -mindepth $i -maxdepth $i -type d) ; do | ||
+ | check_max_processes ${MAXPARALEL} | ||
+ | screen -S ${x} -d -m ${LAUCHRSYNC} -nr ${ITEM} nr_${x} ${LOGDIR} | ||
+ | let x++ | ||
+ | [[ $x =~ [0-9]{1, | ||
+ | done | ||
+ | echo "Deep $i DONE, going upper" | ||
+ | done | ||
+ | echo " | ||
+ | let x=0 | ||
+ | for ((i=0; | ||
+ | printf " | ||
+ | check_max_processes ${MAXPARALEL} | ||
+ | screen -S ${i} -d -m ${LAUCHRSYNC} -r ${DIRLIST[$i]} r_${i} ${LOGDIR} | ||
+ | done | ||
+ | } | ||
+ | |||
+ | sync_this ${TARGET} | ||
+ | </ | ||
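The throttling idea (wait until a slot is free, then launch the next transfer) can be illustrated without screen. Note that this sketch swaps in a different technique from the script above: it counts bash background jobs with `jobs -rp` instead of polling `ps` for screen sessions. The names `wait_for_slot`, `run_throttled` and `fake_transfer` are hypothetical, and `fake_transfer` only stands in for a real rsync call.

```shell
#!/usr/bin/env bash
# wait_for_slot MAX: block until fewer than MAX background jobs are running.
wait_for_slot() {
    local max="$1"
    while [ "$(jobs -rp | wc -l)" -ge "$max" ]; do
        sleep 0.2
    done
}

# run_throttled MAX CMD ITEM...: run "CMD ITEM" once per item,
# keeping at most MAX of them running at the same time.
run_throttled() {
    local max="$1" cmd="$2" item
    shift 2
    for item in "$@"; do
        wait_for_slot "$max"
        "$cmd" "$item" &
    done
    wait    # let the last batch finish before returning
}

# Stand-in for a real transfer: pretend to work on one directory.
fake_transfer() {
    sleep 0.3
    echo "done: $1"
}

run_throttled 2 fake_transfer dir1 dir2 dir3 dir4
```

Background jobs keep the example self-contained, but they die with the calling shell; detached screen sessions, as used by the script, survive a dropped terminal.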
+ | |||
+ | ==== Script Variables ==== | ||
+ | ^ Variable ^ Description ^ | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
+ | |||
+ | ===== launch_rsync.sh ===== | ||
+ | ==== Description ==== | ||
+ | This script will: | ||
+ | * Launch rsync in non-recursive or recursive mode | ||
+ | * Log the exit code of rsync so you can tell whether the transfer went fine or not | ||
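As a rough illustration of the exit-code logging described above, the sketch below runs an arbitrary command and records `<exit code> : <label>` in one of two files depending on the result. The helper name `run_and_log` and the file names `success.log` and `failed.log` are assumptions made for this example; they are not taken from launch_rsync.sh.

```shell
#!/usr/bin/env bash
# run_and_log LOGDIR LABEL CMD [ARGS...]
# Run CMD and append "<exit code> : <label>" to a success or failure log.
# The log file names are hypothetical.
run_and_log() {
    local logdir="$1" label="$2" res
    shift 2
    "$@"
    res=$?
    if [ "$res" -eq 0 ]; then
        echo "$res : $label" >> "$logdir/success.log"
    else
        echo "$res : $label" >> "$logdir/failed.log"
    fi
    return "$res"
}

# Demo with commands that simply succeed or fail.
demo_logs=$(mktemp -d)
run_and_log "$demo_logs" "/data/ok"  true
run_and_log "$demo_logs" "/data/bad" false || echo "transfer of /data/bad failed"
cat "$demo_logs/failed.log"
rm -rf "$demo_logs"
```

Keeping failures in their own file makes it easy to re-run only the directories that did not transfer cleanly.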
+ | |||
+ | |||
+ | ==== Code ==== | ||
+ | <file bash launch_rsync.sh> | ||
+ | # | ||
+ | # launch_rsync.sh | ||
+ | RECURSIVE=$(echo $1 | tr ' | ||
+ | TARGET=$2 | ||
+ | SCREENNAME=$3 | ||
+ | LOGDIR=$4 | ||
+ | DSTSERVER=" | ||
+ | DESTINATION=" | ||
+ | if [[ " | ||
+ | rsync -cdlptgoDv --partial ${TARGET}/* ${DSTSERVER}: | ||
+ | RES=$? | ||
+ | elif [[ " | ||
+ | rsync -cazv --partial ${TARGET}/* ${DSTSERVER}: | ||
+ | RES=$? | ||
+ | else | ||
+ | echo "$0 -nr|-r|--non-recursive|--recursive" | ||
+ | exit 1 | ||
+ | fi | ||
+ | if [ $RES -eq 0 ] ; then | ||
+ | echo "$RES : ${TARGET}" | ||
+ | else | ||
+ | echo "$RES : ${TARGET}" | ||
+ | fi | ||
+ | </ | ||
+ | ==== Variables ==== | ||
+ | ^ Variable ^ Description ^ | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
- | A few days ago I had to transfer a huge directory. | ||
- | Huge means ~50 TB with more than 1 million files and a depth of only 6 folders under the parent one. | ||
- | Since I had to repeat that kind of transfer more than 10 times with the same number of folders, I decided to implement a function that launches parallel rsyncs at a depth of my choosing. | ||
- | The result was this little “pure bash” script (the only dependency is “screen”). You’ll notice that the main function “sync_this()” can run on its own in your script after changing only 2 or 3 variables ;-) |
linux/parallel_rsync.1354813815.txt.gz · Last modified: 2012/12/06 17:10 by dodger