CLI Cheatsheet

Drupal / Drush
Text Processing
Torrents
PDF Tools

Drupal / Drush

Note: These SQL queries target the Drupal 7 schema (node_type table). They won't work as-is on Drupal 8+.

List content types with node counts

drush sqlq 'select count(node.nid) as node_count, node_type.type
            from node
            inner join node_type on node.type = node_type.type
            group by node_type.type'

Filter results by keyword (e.g. "2014")

drush sqlq 'select count(node.nid) as node_count, node_type.type
            from node
            inner join node_type on node.type = node_type.type
            group by node_type.type' | grep 2014

Text Processing

Find and replace inside files (perl)

# Single file type in current directory
perl -pi -w -e 's/SEARCH_FOR/REPLACE_WITH/g;' *.txt

# Recursively, across every file whose name contains a dot
# (in bash, run `shopt -s globstar` first; zsh expands ** by default)
perl -pi -w -e 's/thex/robertsonlibrary/g;' **/*.*

Find and replace in file names

# Rename files matching a pattern (Perl rename); -v prints each rename
rename -v 's/livero/lives/g' **/*.*
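Where the Perl `rename` is not installed, roughly the same result can be had with `find`, `sed`, and `mv`. This is a sketch of a swapped-in technique rather than an equivalent of the command above: `rename_in_names` is a hypothetical helper name, the search text is assumed to be plain (no `/` or regex metacharacters), and file names containing newlines are not handled.

```shell
# Hypothetical helper: replace every occurrence of SEARCH with REPLACE in the
# names of regular files under ROOT (default: current directory).
# SEARCH is used as a sed pattern, so keep it to plain text without "/".
rename_in_names() {
    search=$1
    replace=$2
    root=${3:-.}
    find "$root" -type f -name "*$search*" | while IFS= read -r path; do
        dir=$(dirname "$path")
        base=$(basename "$path")
        new=$(printf '%s\n' "$base" | sed "s/$search/$replace/g")
        # Skip when nothing changed or the target name is already taken
        [ "$base" != "$new" ] || continue
        [ ! -e "$dir/$new" ] || continue
        mv -v -- "$path" "$dir/$new"
    done
}
```

For example, `rename_in_names livero lives .` renames matching files below the current directory and prints each move.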

Torrents

Download via magnet link (aria2c)

# aria2c is a lightweight multi-protocol download utility
aria2c -d ~/Downloads "magnetlink"

PDF Tools

OCR a PDF (ocrmypdf)

# Standard: optimize output, skip pages that already have a text layer
ocrmypdf --optimize 3 --skip-text input.pdf output.pdf

# Aggressive: force re-OCR even if a text layer exists (useful for corrupt/bad layers),
# set DPI manually, use page segmentation mode 1 (automatic with OSD)
ocrmypdf --optimize 3 --image-dpi 300 --output-type pdf \
         --force-ocr --tesseract-pagesegmode 1 input.pdf output.pdf

Downsample a PDF to 72dpi (Ghostscript)

# Single file
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
   -dPDFSETTINGS=/screen \
   -dNOPAUSE -dQUIET -dBATCH \
   -sOutputFile=output.pdf input.pdf

# Batch - processes all PDFs in current folder, preserves original filenames
mkdir -p downsampled
for f in *.pdf *.PDF; do
    [ -f "$f" ] || continue
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen \
       -dNOPAUSE -dBATCH -dQUIET \
       -sOutputFile="downsampled/$f" \
       "$f"
done
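Downsampling does not always shrink a file, so it can be worth comparing sizes after a batch run. A minimal sketch, assuming the `downsampled` folder created by the loop above; `report_pdf_sizes` is a hypothetical helper name, not part of Ghostscript.

```shell
# Hypothetical helper: print "name<TAB>original_bytes<TAB>downsampled_bytes"
# for each PDF in the current directory that also exists in ./downsampled
report_pdf_sizes() {
    for f in *.pdf *.PDF; do
        [ -f "$f" ] || continue
        [ -f "downsampled/$f" ] || continue
        # Arithmetic expansion strips the padding some wc versions add
        before=$(( $(wc -c < "$f") ))
        after=$(( $(wc -c < "downsampled/$f") ))
        printf '%s\t%s\t%s\n' "$f" "$before" "$after"
    done
}
```

Run `report_pdf_sizes` from the folder that holds the original PDFs; any row where the third number is larger than the second marks a file that grew.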

Check image DPI in PDFs (pdfimages)

# Print image info to terminal
for f in *.pdf *.PDF; do
    [ -f "$f" ] || continue
    echo "=== $f ==="
    pdfimages -list "$f"
    echo ""
done

# Append output to images_list.txt instead (delete an old images_list.txt first for a fresh list)
for f in *.pdf *.PDF; do
    [ -f "$f" ] || continue
    echo "=== $f ===" >> images_list.txt
    pdfimages -list "$f" >> images_list.txt
    echo "" >> images_list.txt
done
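To pick out only the high-resolution images from that listing, the output can be filtered with awk. This is a sketch under an assumption about the column layout: in the poppler versions I am aware of, `pdfimages -list` prints two header lines, then one row per image with the page number in field 1, x-ppi in field 13, and y-ppi in field 14. `images_above_dpi` is a hypothetical helper name; check the header of your own output before relying on the field numbers.

```shell
# Hypothetical helper: read `pdfimages -list` output on stdin and print
# "page x-ppi y-ppi" for every image whose x-ppi exceeds the given threshold.
# Assumes the layout described above: two header lines, x-ppi in field 13,
# y-ppi in field 14.
images_above_dpi() {
    awk -v max="$1" 'NR > 2 && $13 + 0 > max + 0 { print $1, $13, $14 }'
}
```

Usage: `pdfimages -list input.pdf | images_above_dpi 150` lists the images that a 150 dpi target would still need to downsample.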

Scan for CCITT encoding

# CCITT is a fax-era compression format - flags PDFs that may cause compatibility issues
for f in *.pdf *.PDF; do
    [ -f "$f" ] || continue
    if pdfimages -list "$f" 2>/dev/null | grep -q " ccitt "; then
        echo "$f uses CCITT"
    fi
done

Audiobook commands

# From the target directory: create an 'input' folder, put the .m4b file in it, then run:
docker run --rm -it \
  -v "$(pwd)":/data \
  sandreas/m4b-tool \
  split "/data/input/The Haunting of Hill House.m4b" \
  --audio-format mp3 \
  --audio-bitrate 128k \
  --audio-channels 1 \
  --output-dir "/data/split" \
  -vv

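Large file transfer to Google Drive (rclone)

# Replace file_location and file_destination with the source path and the remote target
rclone copy file_location file_destination \
  --progress \
  --transfers 4 \
  --checkers 8 \
  --retries 10 \
  --low-level-retries 20 \
  --drive-chunk-size 64M \
  --log-file rclone.log \
  --log-level INFO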
