
Commit 492cc4b8 authored by Jonathan Schaeffer

Commit of the parallel version

parent 345a3f79
@@ -8,12 +8,12 @@ Description of the workflow in our wiki: https://wiki.osug.fr/!isterre-geodata/
 ## Usage
-The script launches 6 parallel jobs to carry out its tasks.
+The script launches 4 parallel jobs to carry out its tasks.
 Configuration is done through environment variables:
-* `RESIFDD_WORKDIR`: the working directory in which the script prepares its packages before shipping, writes its report, etc.
+* `RESIFDD_WORKDIR`: *required* the working directory in which the script prepares its packages before shipping, writes its report, etc.
-* `RESIFDD_DATADIR`: the directory where the script can find the SUMMER mount points `validated_seismic_metadata` and `validated_seismic_data`
+* `RESIFDD_DATADIR`: *required* the directory where the script can find the SUMMER mount points `validated_seismic_metadata` and `validated_seismic_data`
 * `RESIFDD_CONTINUE_FROM`: a previous report file from which the script resumes work where it left off. If the report mentions transfer errors, the script will retry them
 * `RESIFDD_START_AT`: a year from which to resume the transfer. All items belonging to an earlier year are ignored
 * `RESIFDD_KEYFILE`: if this variable points to a valid file, it is used to transfer the data corresponding to the keys listed in that file.
@@ -24,13 +24,13 @@ Configuration is done through environment variables:
 Start the transfer of all data from 2009 onwards
 ``` shell
-RESIFDD_WORKDIR=/osug-dc/resif RESIFDD_DATADIR=/scratch/resifdumper RESIFDD_START_AT=2009 src/resifdatadump
+RESIFDD_WORKDIR=/osug-dc/resif RESIFDD_DATADIR=/scratch/resifdumper RESIFDD_START_AT=2009 src/resifdatadump-parallel
 ```
 Transfer the data listed in the `RESIFDD_KEYFILE` file:
 ``` shell
-RESIFDD_WORKDIR=/osug-dc/resif RESIFDD_DATADIR=/scratch/resifdumper RESIFDD_KEYFILE=/scratch/resifdumper/keys.txt src/resifdatadump
+RESIFDD_WORKDIR=/osug-dc/resif RESIFDD_DATADIR=/scratch/resifdumper RESIFDD_KEYFILE=/scratch/resifdumper/keys.txt src/resifdatadump-parallel
 ```
 The file must contain one key per line, as reported in the logs:
@@ -47,3 +47,5 @@ The file must contain one key per line, as reported in the logs:
 2016_MT_CLP2
 2016_RA_NCAD
 ```
+A key file can be generated with the Python script `src/scan_dupms.py`.
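The key-to-directory mapping implied above (a key `YYYY_NET_STA` names the directory `YYYY/NET/STA` under the snapshot root) can be sketched in shell. The `/tmp` paths below are made up for the demo; only the underscore-to-slash convention comes from the document:

``` shell
# Sketch with hypothetical /tmp paths: underscores in a key become
# path separators under the snapshot root; missing dirs are reported.
SNAPSHOT_DIR=/tmp/snapshot_demo
mkdir -p "$SNAPSHOT_DIR/2016/MT/CLP2"
printf '2016_MT_CLP2\n2016_RA_NCAD\n' > /tmp/keys_demo.txt
while read -r key; do
    dir="$SNAPSHOT_DIR/$(echo "$key" | tr '_' '/')"
    if [ -d "$dir" ]; then
        echo "$key -> $dir"
    else
        echo "$key does not exist, ignored"
    fi
done < /tmp/keys_demo.txt
```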
@@ -5,7 +5,7 @@
 # This line tells to redirect all outputs to logger and stdout
 exec 1> >(logger -s -t $(basename $0)) 2>&1
+# set -u
 set -a
 ####################
 #
@@ -44,53 +44,44 @@ format_report(){
 }
 # Push data to irods as a staging file
-# Arguments :
-# 1. the distant directory to push to
-# 2. the hash of the file to push
+# Argument is the distant directory to push to.
 # If something goes wrong, roll back
 # If everything goes fine, then validate staging data
 irods_push(){
     KEY=$1
-    SHA=$2
     SIZE=$(stat -c %s $RESIFDD_WORKDIR/$KEY.tar)
     SIZEMB=$(($SIZE/1024/1024))
-    echo "[$KEY] Sending data to iRODS ($SIZEMB MB, sha2: $SHA)"
     for n in $(seq 1 5); do [ $n -gt 1 ] && sleep 10 ; imkdir -p $KEY && s=0 && break || s=$?; done
     if [[ $s -ne 0 ]]; then
-        echo "[$KEY] Error 002 creating remote directory."
+        echo "[$KEY] Error 002 creating remote directory. Manual action has to be taken."
         format_report $KEY $SIZEMB $(date +%Y-%m-%dT%H:%M:%S) - - 'Error 010. imkdir failed' >> $LOCAL_REPORT
-        zabbix_err "${KEY} Error 002"
+        zabbix_err "${KEY}:Error 002"
         return 1
     fi
+    echo "[$KEY] Cleaning old staging.tar if exists"
+    irm -f $KEY/staging.tar > /dev/null 2>&1
+    echo "[$KEY] Sending data to iRODS ($SIZEMB MB)"
-    COMMAND="iput --retries 5 -T -f -X $RESIFDD_WORKDIR/${KEY}.restart $RESIFDD_WORKDIR/${KEY}.tar $KEY/staging.tar"
+    COMMAND="iput --retries 5 -T -K -f -X $RESIFDD_WORKDIR/${KEY}.restart $RESIFDD_WORKDIR/${KEY}.tar $KEY/staging.tar"
     start=$(date +%s)
-    if $COMMAND; then
+    eval $COMMAND
+    IPUTRC=$?
+    if [[ $IPUTRC -ne 0 ]]; then
         echo "[$KEY] Error 011 sending file to irods. The command was: $COMMAND"
         format_report $KEY $SIZEMB $(date +%Y-%m-%dT%H:%M:%S --date=@$start) - - 'Error 011. Transfer cancelled' >> $LOCAL_REPORT
         # Roll back
         irods_rollback $KEY
-        rm $RESIFDD_WORKDIR/${KEY}.tar
         # Alert to zabbix
-        zabbix_err "${KEY} Error 011"
+        zabbix_err "${KEY}:Error 011"
     else
         duration=$(($(date +%s)-$start))
         throughput=$(($SIZEMB / $duration ))
-        # Compute remote checksum
-        echo "[$KEY] staging.tar data sent, let's compute remote checksum"
-        irods_sha=$(ichksum $KEY/staging.tar | awk -F':' '/sha2:/ {print $2; exit;}')
-        echo "[$KEY] $irods_sha"
-        if [[ "$SHA" = "$irods_sha" ]]; then
-            echo "[$KEY] checksums match, commit remote data"
-            format_report $KEY $SIZEMB $(date +%Y-%m-%dT%H:%M:%S --date=@$start) $duration $throughput 'OK' >> $LOCAL_REPORT
-            irods_commit $KEY
-            zabbix_ok "$KEY|${SIZEMB}MB|${duration}s|${throughput}MB/s"
-        else
-            echo "[$KEY] Error 012, checksum mismatch"
-            zabbix_err "${KEY} Error 012"
-        fi
-        rm $RESIFDD_WORKDIR/${KEY}.*
+        echo "[$KEY] staging.tar data sent, let's commit everything on irods server"
+        format_report $KEY $SIZEMB $(date +%Y-%m-%dT%H:%M:%S --date=@$start) $duration $throughput 'OK' >> $LOCAL_REPORT
+        irods_commit $KEY
     fi
+    zabbix_ok "$KEY|${SIZEMB}MB|${duration}s|${throughput}MB/s"
     # Send report to irods. Do some locking here
     (
         flock -e 200
@@ -136,7 +127,7 @@ irods_commit(){
     imv $KEY/previous.tar $KEY/previous_to_delete.tar
     if [[ $? -ne 0 ]]; then
         echo "[$KEY] Error 003 moving previous.tar around. Corrective action has to be taken manually"
-        zabbix_err "${KEY} Error 003"
+        zabbix_err "${KEY}:Error 003"
         return 1
     fi
 )
@@ -146,7 +137,7 @@ irods_commit(){
     imv $KEY/latest.tar $KEY/previous.tar
     if [[ $? -ne 0 ]]; then
         echo "[$KEY] Error 004 moving latest.tar to previous.tar. Corrective action has to be taken manually"
-        zabbix_err "${KEY} Error 004"
+        zabbix_err "${KEY}:Error 004"
         return 1
     fi
 )
@@ -154,7 +145,7 @@ irods_commit(){
     imv $KEY/staging.tar $KEY/latest.tar
     if [[ $? -ne 0 ]]; then
         echo "[$KEY] Error 005 moving staging.tar to latest.tar. Corrective action has to be taken manually"
-        zabbix_err "${KEY} Error 005"
+        zabbix_err "${KEY}:Error 005"
         return 1
     fi
     ils $KEY/previous_to_delete.tar 2>/dev/null && irm -f $KEY/previous_to_delete.tar
@@ -174,9 +165,6 @@ pack_and_send() {
     NETWORK=${YNS[-2]}
     STATION=${YNS[-1]}
     KEY=${YEAR}_${NETWORK}_${STATION}
-    if [[ $YEAR -lt $RESIFDD_START_AT ]]; then
-        return 0
-    fi
     echo "[$KEY] Starting job $2"
     # Test if in recovery mode, we should send or not
     if [[ -r $RECOVERY_FILE ]]; then
@@ -187,35 +175,34 @@ pack_and_send() {
         fi
     fi
     echo "[$KEY] Creating tar on $RESIFDD_WORKDIR/$KEY.tar"
-    echo "[$KEY] tar cf $RESIFDD_WORKDIR/$KEY.tar -C ${dir%$YEAR/$NETWORK/$STATION} ./$YEAR/$NETWORK/$STATION"
+    echo "[$KEY] tar cf $RESIFDD_WORKDIR/$KEY.tar -C ${dir%$YEAR/$NETWORK/$STATION} $dir"
-    tar cf $RESIFDD_WORKDIR/$KEY.tar -C ${dir%$YEAR/$NETWORK/$STATION} ./$YEAR/$NETWORK/$STATION
+    tar cf $RESIFDD_WORKDIR/$KEY.tar -C ${dir%$YEAR/$NETWORK/$STATION} $dir
     if [[ $? -ne 0 ]]; then
         # Something went wrong creating archive. Exit
         echo "[$KEY] Error 007 creating tar"
-        rm -f $RESIFDD_WORKDIR/$KEY.tar
         # Send key to zabbix_err
-        zabbix_err "$KEY Error 007"
+        zabbix_err "$KEY:Error 007"
         return 1
     fi
-    local_sha=$(sha256sum $RESIFDD_WORKDIR/$KEY.tar | awk '{print $1}' | xxd -r -p | base64)
     # Check if file exists on irods server
     ils -L $KEY/latest.tar > /dev/null 2>&1
     if [[ $? -eq 0 ]]; then
         echo "[$KEY] latest.tar already exists on iRODS server. Let's compare hashes"
+        local_sha=$(sha256sum $RESIFDD_WORKDIR/$KEY.tar | awk '{print $1}' | xxd -r -p | base64)
         irods_sha=$(ichksum $KEY/latest.tar | awk -F':' '/sha2:/ {print $2; exit;}')
         echo "[$KEY] local checksum: $local_sha"
         echo "[$KEY] irods checksum: $irods_sha"
         # If the hashes differ, then move distant file and push this one
         if [[ "$local_sha" = "$irods_sha" ]]; then
             echo "[$KEY] The archive on irods is the same as our version. Skipping."
-            SIZE=$(stat -c %s $RESIFDD_WORKDIR/$KEY.tar)
-            format_report $KEY $(($SIZE/1024/1024)) $(date +%Y-%m-%dT%H:%M:%S) "-" "-" "Skipped" >> $LOCAL_REPORT
-            rm $RESIFDD_WORKDIR/$KEY*
+            format_report $KEY "-" $(date +%Y-%m-%dT%H:%M:%S) "-" "-" "Skipped" >> $LOCAL_REPORT
             return 0
         fi
     fi
     # Send latest archive file to IRODS
-    irods_push $KEY $local_sha
+    irods_push $KEY
+    rm $RESIFDD_WORKDIR/$KEY*
 }
 export -f pack_and_send # Necessary for call with GNU parallel
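The `export -f` above is what makes the bash function visible to the child shells that GNU parallel spawns for each job slot. A minimal sketch of that mechanism, with `demo_job` as a made-up stand-in for `pack_and_send` (parallel invokes roughly `bash -c '<function> <args>'` per slot):

``` shell
# A bash function only survives into child bash processes if exported
# with `export -f`. In the real script, {} is the directory and {%}
# the parallel job-slot number passed as $1 and $2.
demo_job() {
    echo "slot $2 handling $1"
}
export -f demo_job
# Equivalent of what one parallel job slot executes:
bash -c 'demo_job "$@"' _ item_a 1
```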
@@ -246,7 +233,6 @@ if [[ ! -d $RESIFDD_DATADIR ]]; then
     exit 1
 fi
-####################
 #
 # Option ContinueFrom
@@ -258,19 +244,16 @@ if [[ -r ${RESIFDD_CONTINUE_FROM_FILE} ]]; then
     cp $RESIFDD_CONTINUE_FROM_FILE $RESIFDD_WORKDIR/recovery.$$
     RECOVERY_FILE=$RESIFDD_WORKDIR/recovery.$$
     echo "Now using $RESIFDD_WORKDIR/recovery.$$ as recovery file"
+    LOCAL_REPORT=$RESIFDD_CONTINUE_FROM_FILE
 else
     echo "No recovery file present. Dumping everything now"
+    LOCAL_REPORT=$RESIFDD_WORKDIR/report_$(date +%Y%m%d-%H%M).csv
+    format_report "Year_Network_Station" "Size(MB)" "Dumpdate" "Duration(s)" "Throughput(MB/s)" "Comment" > $LOCAL_REPORT
 fi
 # Header for the report :
 IRODS_REPORT=reports/$(date +%Y%m%d-%H%M).csv
-LOCAL_REPORT=$RESIFDD_WORKDIR/report_$(date +%Y%m%d-%H%M).csv
-format_report "Year_Network_Station" "Size(MB)" "Dumpdate" "Duration(s)" "Throughput(MB/s)" "Comment" > $LOCAL_REPORT
 imkdir -p reports
 iput -f $LOCAL_REPORT $IRODS_REPORT
-rm -f $RESIFDD_WORKDIR/report.lock
-rm -f $RESIFDD_WORKDIR/*.tar
 ##################
 #
@@ -278,35 +261,26 @@ rm -f $RESIFDD_WORKDIR/*.tar
 #
 ##################
 KEY="validated-seismic-metadata"
-MONTH=$(date +%Y-%m)
-DUMP_METADATA="yes" # a flag to tell if we have to dump the metadata or not
 if [[ -r $RECOVERY_FILE ]] && egrep -q -e ".*($KEY ).*( OK | Skipped ).*" $RECOVERY_FILE ; then
     format_report $KEY "-" $(date +%Y-%m-%dT%H:%M:%S) "-" "-" "Skipped" >> $LOCAL_REPORT
-    DUMP_METADATA="no"
-fi
-if [[ -r $RESIFDD_KEYFILE ]] && ! grep -q $KEY $RESIFDD_KEYFILE; then
-    echo "Keyfile does not contain validated-seismic-metadata"
-    DUMP_METADATA="no"
-fi
-if [[ "x$DUMP_METADATA" = "xyes" ]]; then
+else
     # Get the snapshot name for this month
-    SNAPSHOT_DIR=$(ls -d $RESIFDD_DATADIR/validated_seismic_metadata/.snapshot/monthly.${MONTH}*|tail -1)
+    MONTH=$(date +%Y-%m)
+    SNAPSHOT_DIR=$(ls -d $RESIFDD_DATADIR/validated_seismic_metadata/.snapshot/weekly.${MONTH}*|tail -1)
     if [[ ! -d $SNAPSHOT_DIR ]]; then
         echo "Error 000 Snapshot directory $SNAPSHOT_DIR does not exist"
         exit 1
     fi
     echo "[$KEY] Starting dump from ${SNAPSHOT_DIR}"
-    tar cf $RESIFDD_WORKDIR/$KEY.tar --exclude portalproducts -C $SNAPSHOT_DIR .
+    tar cf $RESIFDD_WORKDIR/$KEY.tar --exclude portalproducts -C $SNAPSHOT_DIR $SNAPSHOT_DIR
     if [[ $? -ne 0 ]]; then
         echo "[$KEY] Error 001 while creating tar archive."
-        zabbix_err "${KEY} Error 001"
+        zabbix_err "${KEY}:Error 001"
         exit 1
     fi
-    local_sha=$(sha256sum $RESIFDD_WORKDIR/$KEY.tar | awk '{print $1}' | xxd -r -p | base64)
-    irods_push $KEY $local_sha
+    irods_push $KEY
     echo "[$KEY] Dump terminated :"
     ils -l $KEY
 fi
@@ -323,21 +297,6 @@ if [[ ! -d $SNAPSHOT_DIR ]]; then
     exit 1
 fi
-echo "Starting dump of validated data"
+echo "Starting dump of validated data with 4 jobs"
-if [[ -r ${RESIFDD_KEYFILE} ]]; then
-    # Continue from previous report
-    echo "Using $RESIFDD_KEYFILE as references to transfer data"
-    # Make a list of directories from KEYFILE and pass it to pack_and_send
-    # We use only the first word of each line, compose a full path and test for its existence
-    for i in $(sed -e "s, .*$,," -e "/^\s*$/d" -e "s,_,/,g" -e "s,^,$SNAPSHOT_DIR/," $RESIFDD_KEYFILE |sort -u); do
-        if [[ -d $i ]]; then
-            pack_and_send $i
-        else
-            echo "$i does not exist, ignored"
-        fi
-    done
-else
-    # Normal operations, browsing $SNAPSHOT_DIR
-    find $SNAPSHOT_DIR -maxdepth 3 -mindepth 3 -type d | sort | parallel --jobs 4 --max-args 1 pack_and_send {} {%}
-fi
+find $SNAPSHOT_DIR -maxdepth 3 -mindepth 3 -type d | sort | parallel --jobs 4 --max-args 1 pack_and_send {} {%}
 echo "Dump of validated data done"
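Both scripts compare local and remote checksums by converting `sha256sum`'s hex digest into the base64 form that `ichksum` reports under `sha2:`. That conversion step in isolation, on a throwaway `/tmp` file:

``` shell
# iRODS sha2 checksums are base64 of the raw SHA-256 digest, while
# sha256sum prints hex: hex -> raw bytes (xxd -r -p) -> base64.
printf 'abc' > /tmp/chk_demo
sha256sum /tmp/chk_demo | awk '{print $1}' | xxd -r -p | base64
# -> ungWv48Bz+pBQUDeXa4iI7ADYaOWF3qctBD/YfIAFa0=
```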
#!/usr/bin/python3
import subprocess
import re
import glob, os.path
from collections import defaultdict

def get_irods_content():
    remote_data = defaultdict(lambda: defaultdict(dict))
    result = subprocess.run(['ils', '-r', '-L'], stdout=subprocess.PIPE)
    # Prepare a data structure (dict):
    # remote_data = {
    #     key => {
    #         latest.tar   => {sha2 => '', size => ''},
    #         previous.tar => {sha2 => '', size => ''}
    #     }, ...
    # }
    # Test:
    # root_dir='/tempZone/home/jschaeffer/'
    # Prod:
    root_dir = '/cc-lyon/synchro/resif/'
    current_key = ''
    total_size = 0
    words = []
    for line in result.stdout.decode('utf-8').split('\n'):
        if line.startswith(root_dir):
            current_key = line.replace(root_dir, '').replace('_', '/').replace(':', '')
        elif re.search(r'[0-9]{4}/[A-Z0-9]+/[A-Z0-9]+', current_key):
            if re.search(r'generic.*\.tar\s*$', line):
                # sha2:g+RIFc7e6CVgkxpuucvj8GFpE7C8Vg0n4JdJmvUZtkE= generic /irods/Vault/archivage/resif/synchro/resif/2008_ZO_Y111/latest.tar
                if re.search('sha2:', line):
                    # `words` still holds the data line parsed just before this
                    # checksum line; its last field is the file name (latest.tar)
                    remote_data[current_key][words[-1]]['sha2'] = re.split(r'\s+', line)[1].split(':')[1]
            elif re.search(r'\.tar\s*$', line):
                # resif 0 Resif1;resifcache1 3985223680 2019-07-19.04:36 & latest.tar
                words = re.split(' +', line)
                remote_data[current_key][words[-1]]['size'] = words[4]
                total_size = total_size + int(words[4])
    print("Total iRODS storage used : " + str(total_size / (1024 ** 3)) + "GB")
    return remote_data

# Now we know the state on the irods server. Compare it with our repository:
# look for YYYY/NET/STATION triples in /osug-dc/resif/validated_seismic_data,
# check each one for existence in remote_data, and if it is absent,
# print a message and keep the key for later.
def browse_local_data():
    filesDepth3 = glob.glob('/osug-dc/resif/validated_seismic_data/*/*/*')
    dirsDepth3 = filter(lambda f: os.path.isdir(f), filesDepth3)
    return list(dirsDepth3)

if __name__ == "__main__":
    dirs = browse_local_data()
    remote_data = get_irods_content()
    missing_keys = []
    for d in dirs:
        key = d.replace('/osug-dc/resif/validated_seismic_data/', '')
        if key in remote_data.keys():
            print(key + " => OK")
        else:
            missing_keys.append(key.replace('/', '_'))
            print(key + " missing")
    print("List of missing keys (usable with RESIFDD_KEYFILE) : ")
    print('\n'.join(missing_keys))
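The key derivation used in the `__main__` block (a depth-3 directory path becomes a `YYYY_NET_STA` key by dropping the root and replacing `/` with `_`) can be checked from the shell; the `/tmp/vsd_demo` root below is a made-up stand-in for `/osug-dc/resif/validated_seismic_data`:

``` shell
# Sketch with a hypothetical /tmp root: list depth-3 directories and
# turn each root-relative path into a key by replacing / with _.
ROOT=/tmp/vsd_demo
mkdir -p "$ROOT/2008/ZO/Y111"
find "$ROOT" -mindepth 3 -maxdepth 3 -type d | while read -r d; do
    echo "${d#$ROOT/}" | tr '/' '_'
done
# -> 2008_ZO_Y111
```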