I have set up the Docker container to run the model evaluation.
Run find . -name "*.pdf" -exec python3 process.py {} txt/{}.txt \; but don't rewrite files that already exist. Instead of .pdf.txt make it just .txt
To run `find` and process each PDF with `process.py`, skipping any output file that already exists and naming each output `.txt` rather than `.pdf.txt`, use this command:
mkdir -p txt && find . -name "*.pdf" -exec sh -c '
  input="$0"
  output="txt/$(basename "$input" .pdf).txt"
  [ -f "$output" ] || python3 process.py "$input" "$output"
' {} \;
I extracted the following code from that output:
mkdir -p txt && find . -name "*.pdf" -exec sh -c '
  input="$0"
  output="txt/$(basename "$input" .pdf).txt"
  [ -f "$output" ] || python3 process.py "$input" "$output"
' {} \;
Running the following program:
mkdir -p txt && find . -name "*.pdf" -exec sh -c '
  input="$0"
  output="txt/$(basename "$input" .pdf).txt"
  [ -f "$output" ] || python3 process.py "$input" "$output"
' {} \;
final: True
Resulting in output: True
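The skip-if-exists behavior of the command can be tried without touching real files. The following is a minimal sketch, not part of the evaluated answer: it works in a temporary directory, and `convert` is a hypothetical stand-in for `python3 process.py` that simply copies its input, so the file names and contents here are assumptions made for the demonstration.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"

# Hypothetical stand-in for process.py: copies the input to the output path.
convert() { cp "$1" "$2"; }

mkdir -p txt docs
printf 'new\n' > docs/a.pdf
printf 'new\n' > docs/b.pdf
printf 'old\n' > txt/a.txt    # pre-existing output that must survive

# Same naming and existence check as the command, written as a loop.
for input in docs/*.pdf; do
  output="txt/$(basename "$input" .pdf).txt"
  [ -f "$output" ] || convert "$input" "$output"
done

cat txt/a.txt   # prints "old": the existing file was skipped
cat txt/b.txt   # prints "new": created because it was missing
```

Because `basename` drops the directory part, two PDFs with the same file name in different directories map to the same output path, and the second one is skipped by the existence check.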