I have set up the Docker container to run the model evaluation.
Run find . -name "*.pdf" -exec python3 process.py {} txt/{}.txt \; but don't rewrite files that already exist. Instead of .pdf.txt make it just .txt
To modify your command so that it does not overwrite existing `.txt` files and names each output `.txt` instead of `.pdf.txt`, you can use a combination of `find`, shell constructs, and `python3`. Here’s how you can do it:
    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="${pdf%.pdf}.txt"; [ ! -e "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} +

I extracted the following code from that output:

    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="${pdf%.pdf}.txt"; [ ! -e "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} +

Running the following program:

    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="${pdf%.pdf}.txt"; [ ! -e "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} +
Traceback (most recent call last):
  File "/usr/src/app/check.py", line 6, in <module>
    print('final: ' + str(check_ok()))
  File "/usr/src/app/check.py", line 3, in check_ok
    return open("txt/a.txt").read() == "done1" and open("txt/b.txt").read() == "hello2" and open("txt/c.txt").read() == "done3" and len(os.listdir("txt")) == 3
FileNotFoundError: [Errno 2] No such file or directory: 'txt/b.txt'
Resulting in output: False
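The check reads from `txt/`, while the extracted command builds the output path with `${pdf%.pdf}.txt`, which keeps the PDF's own directory. That mismatch is one plausible reason the check fails. A minimal sketch of the two path derivations that appear in this log, using a hypothetical input path `./docs/a.pdf` that is not taken from the test setup:

```shell
# Hypothetical input path, chosen only for illustration.
pdf="./docs/a.pdf"

# Suffix stripping keeps the directory part: the output lands next to the PDF.
same_dir="${pdf%.pdf}.txt"

# basename drops the directory and the .pdf suffix; the txt/ prefix is added explicitly.
in_txt="txt/$(basename "$pdf" .pdf).txt"

echo "$same_dir"   # ./docs/a.txt
echo "$in_txt"     # txt/a.txt
```

Only the second form writes under `txt/`, which is where the original command in the prompt puts its output.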
I have set up the Docker container to run the model evaluation.
Run find . -name "*.pdf" -exec python3 process.py {} txt/{}.txt \; but don't rewrite files that already exist. Instead of .pdf.txt make it just .txt
You can achieve this by modifying your `find` command to check for the existence of the output file before executing the `process.py` script. You can use a shell command within the `-exec` option to perform this check. Here’s how you can do it:
    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="txt/$(basename "$pdf" .pdf).txt"; [ ! -f "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} +

    mkdir -p txt

I extracted the following code from that output:

    #!/bin/bash

    # Create the output directory if it doesn't exist
    mkdir -p txt

    # Find all PDF files and process them
    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="txt/$(basename "$pdf" .pdf).txt"; [ ! -f "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} +

Running the following program:

    #!/bin/bash

    # Create the output directory if it doesn't exist
    mkdir -p txt

    # Find all PDF files and process them
    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="txt/$(basename "$pdf" .pdf).txt"; [ ! -f "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} +
final: True
Resulting in output: True
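The skip-if-exists behaviour of this script can be tried without the real `process.py`. The following is a sketch under stated assumptions: it runs in a throwaway directory, the file names are hypothetical, and `printf` stands in for `process.py`.

```shell
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical inputs: two PDFs, one of which already has an output file.
touch a.pdf b.pdf
mkdir -p txt
printf 'keep' > txt/b.txt

# Same structure as the extracted script; printf stands in for process.py (assumption).
find . -name "*.pdf" -exec sh -c '
    for pdf; do
        txt="txt/$(basename "$pdf" .pdf).txt"
        [ -f "$txt" ] || printf "new" > "$txt"
    done
' sh {} +

a_content=$(cat txt/a.txt)
b_content=$(cat txt/b.txt)
echo "a.txt: $a_content"   # a.txt: new
echo "b.txt: $b_content"   # b.txt: keep
```

The `[ -f "$txt" ] || ...` form is used here only so that a skipped file leaves a zero exit status; the `[ ! -f "$txt" ] && ...` form in the log skips existing files in the same way.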
I have set up the Docker container to run the model evaluation.
Run find . -name "*.pdf" -exec python3 process.py {} txt/{}.txt \; but don't rewrite files that already exist. Instead of .pdf.txt make it just .txt
To run the command you're describing, ensuring that you don't overwrite existing `.txt` files and that the output file has the desired naming convention (just `.txt`), you can modify your command as follows:
    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="${pdf%.pdf}.txt"; [ ! -e "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} \;

I extracted the following code from that output:

    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="${pdf%.pdf}.txt"; [ ! -e "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} \;

Running the following program:

    find . -name "*.pdf" -exec sh -c 'for pdf; do txt="${pdf%.pdf}.txt"; [ ! -e "$txt" ] && python3 process.py "$pdf" "$txt"; done' sh {} \;
Traceback (most recent call last):
  File "/usr/src/app/check.py", line 6, in <module>
    print('final: ' + str(check_ok()))
  File "/usr/src/app/check.py", line 3, in check_ok
    return open("txt/a.txt").read() == "done1" and open("txt/b.txt").read() == "hello2" and open("txt/c.txt").read() == "done3" and len(os.listdir("txt")) == 3
FileNotFoundError: [Errno 2] No such file or directory: 'txt/b.txt'
Resulting in output: False
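This attempt ends `-exec` with `\;` instead of `+`, and it fails with the same error as the first attempt, so the terminator does not appear to be the cause. A small sketch of what the terminator does change, using three hypothetical empty files in a temporary directory:

```shell
tmp=$(mktemp -d)
touch "$tmp/a.pdf" "$tmp/b.pdf" "$tmp/c.pdf"

# With \; the inner shell starts once per file, so three lines are printed.
per_file=$(find "$tmp" -name "*.pdf" -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

# With + the files are batched into as few invocations as possible, here one.
batched=$(find "$tmp" -name "*.pdf" -exec sh -c 'echo "$#"' sh {} + | wc -l | tr -d ' ')

echo "per file: $per_file"   # per file: 3
echo "batched: $batched"     # batched: 1
rm -rf "$tmp"
```

Either terminator works with the `for pdf; do ... done` loop; with `\;` the loop simply sees one argument per invocation.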
I have set up the Docker container to run the model evaluation.
Run find . -name "*.pdf" -exec python3 process.py {} txt/{}.txt \; but don't rewrite files that already exist. Instead of .pdf.txt make it just .txt
You can modify the command to prevent overwriting existing files by adding a conditional check for the output file before running the Python script. Here's how you can do this:
    find . -name "*.pdf" -exec sh -c '
        for pdf; do
            txt_file="txt/$(basename "$pdf" .pdf).txt"
            if [ ! -f "$txt_file" ]; then
                python3 process.py "$pdf" "$txt_file"
            fi
        done
    ' sh {} +

    mkdir -p txt

I extracted the following code from that output:

    #!/bin/bash

    mkdir -p txt

    find . -name "*.pdf" -exec sh -c '
        for pdf; do
            txt_file="txt/$(basename "$pdf" .pdf).txt"
            if [ ! -f "$txt_file" ]; then
                python3 process.py "$pdf" "$txt_file"
            fi
        done
    ' sh {} +

Running the following program:

    #!/bin/bash

    mkdir -p txt

    find . -name "*.pdf" -exec sh -c '
        for pdf; do
            txt_file="txt/$(basename "$pdf" .pdf).txt"
            if [ ! -f "$txt_file" ]; then
                python3 process.py "$pdf" "$txt_file"
            fi
        done
    ' sh {} +
final: True
Resulting in output: True
I have set up the Docker container to run the model evaluation.
Run find . -name "*.pdf" -exec python3 process.py {} txt/{}.txt \; but don't rewrite files that already exist. Instead of .pdf.txt make it just .txt
You can modify your command to check if the output file already exists before processing the PDF files. You can achieve this by using a conditional expression in the `-exec` option of the `find` command. Here’s how you can do it:
    find . -name "*.pdf" -exec sh -c '
        for pdf_file; do
            output_file="txt/$(basename "${pdf_file%.pdf}.txt")"
            if [ ! -f "$output_file" ]; then
                python3 process.py "$pdf_file" "$output_file"
            fi
        done
    ' sh {} +

    mkdir -p txt

I extracted the following code from that output:

    #!/bin/bash

    # Create the output directory if it does not exist
    mkdir -p txt

    # Find and process PDF files
    find . -name "*.pdf" -exec sh -c '
        for pdf_file; do
            output_file="txt/$(basename "${pdf_file%.pdf}.txt")"
            if [ ! -f "$output_file" ]; then
                python3 process.py "$pdf_file" "$output_file"
            fi
        done
    ' sh {} +

Running the following program:

    #!/bin/bash

    # Create the output directory if it does not exist
    mkdir -p txt

    # Find and process PDF files
    find . -name "*.pdf" -exec sh -c '
        for pdf_file; do
            output_file="txt/$(basename "${pdf_file%.pdf}.txt")"
            if [ ! -f "$output_file" ]; then
                python3 process.py "$pdf_file" "$output_file"
            fi
        done
    ' sh {} +
final: True
Resulting in output: True