Text

echo

To display text on standard output

echo "Path is $PATH"

To create a file with text

echo "foo" > file.txt

To append text at the end of an existing file

echo "bar" >> file.txt

tr

Replace characters from one set with characters from another

tr <FROM_SET> <TO_SET>

To replace every a with b

echo aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa | tr a b
# bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb

To replace [a-c] (a,b,c) by d

echo abcd | tr a-c d
# dddd
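Besides one-to-one replacement, tr can also map character classes, delete characters, or squeeze repeats. A minimal sketch, assuming GNU tr; the sample strings are made up for illustration:

```shell
# Convert lowercase to uppercase using POSIX character classes
upper=$(echo "hello" | tr '[:lower:]' '[:upper:]')
echo "$upper"

# Delete every dash
no_dash=$(echo "a-b-c" | tr --delete '-')
echo "$no_dash"

# Squeeze runs of spaces into a single space
squeezed=$(echo "Hello    world" | tr --squeeze-repeats ' ')
echo "$squeezed"
```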

grep

Returns 0 if a match is found (with --max-count=1, grep stops reading after the first match)

echo Foobar | grep --max-count=1 --ignore-case foo; echo $?
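The exit status is mostly useful in a condition. A minimal sketch, assuming GNU grep; the variable name result is only for illustration:

```shell
# --quiet suppresses the output, so only the exit status is used
if echo Foobar | grep --quiet --ignore-case foo; then
    result="match"
else
    result="no match"
fi
echo "$result"
```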

cat (concatenate)

https://github.com/coreutils/coreutils/blob/master/src/cat.c

Original use

To produce a file containing the content of both files

cat file1.txt file2.txt > file3.txt

Hacks

To display a file's content (showing special characters and line numbers)

cat --show-all --number README.startup.md

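To append text to an existing file interactively (hit Ctrl-D after keying in, you can key in several lines). Note that cat file.txt - > file.txt does not work: the shell truncates file.txt before cat reads it, so the original content is lost.

cat >> file.txt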

To create a file and supply text interactively (hit Ctrl-D after keying in, you can key in several lines)

cat > file.txt
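When the text is known in advance, a here-document can replace the keyboard input. A minimal sketch; the temporary file created with mktemp stands in for file.txt and is an assumption of this example:

```shell
# Create a scratch file so the example does not overwrite anything
demo_file=$(mktemp)

# Everything up to the EOF marker is written to the file
cat > "$demo_file" <<'EOF'
first line
second line
EOF

line_count=$(wc -l < "$demo_file")
echo "$line_count"
rm "$demo_file"
```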

head

Show the first lines of a huge file, or just the first one

head --lines=1 $FILE

tail

Show last lines of a huge file (or a fd)

tail $FILE

Show the last lines of a file that is currently being written

tail --follow --lines=10 $FILE
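head and tail can be chained to extract a range of lines. A minimal sketch, assuming GNU coreutils, with seq standing in for a real file:

```shell
# Lines 16 to 20 of a 100-line input:
# keep the first 20 lines, then the last 5 of those
range=$(seq 1 100 | head --lines=20 | tail --lines=5)
echo "$range"

first_line=$(echo "$range" | head --lines=1)
last_line=$(echo "$range" | tail --lines=1)
```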

split

Split one file into several files (by line count)

Take the first 5 million lines and split them into chunks of 100,000 lines named chunk<ID_DECIMAL>.csv

head -n 5000000 $RAW_DATA_FILE_PATH | split -d -l 100000 --additional-suffix=.csv - chunk
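The same options can be tried on a small input to see how the chunks are named. A minimal sketch, assuming GNU split; the scratch directory and the 10-line input are made up for illustration:

```shell
work_dir=$(mktemp -d)

# 10 lines split into chunks of 4 lines: 4 + 4 + 2
seq 1 10 | split -d -l 4 --additional-suffix=.csv - "$work_dir/chunk"

# Lists chunk00.csv, chunk01.csv, chunk02.csv
ls "$work_dir"

chunk_count=$(ls "$work_dir" | wc -l)
last_chunk_lines=$(wc -l < "$work_dir/chunk02.csv")
rm -r "$work_dir"
```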

cut

cut picks fields from a list of rows, e.g. delimiter-separated values

jane;doe;scientist
mary;shelley;writer
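With the two sample rows above, fields can be picked by position. A minimal sketch, assuming GNU cut; the variable names are only for illustration:

```shell
# The sample rows, with ';' as the delimiter
rows='jane;doe;scientist
mary;shelley;writer'

# Third field of every row
jobs=$(echo "$rows" | cut --delimiter=';' --fields=3)
echo "$jobs"

# First and second fields; the delimiter is kept between them
names=$(echo "$rows" | cut --delimiter=';' --fields=1,2)
echo "$names"
```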

Pick the second field, e.g. a folder in a path

echo 'a/b/c' | cut --delimiter=/ --fields=2
b

Strip the first field, e.g. to remove a leading folder from a path

echo 'a/b/c' | cut --delimiter=/ --fields=1 --complement
b/c

diff

Show differences between two files.

diff $FILE_ONE $FILE_TWO

Show differences between the outputs of two commands (using process substitution).

Compare the files which are modified but not staged with the files which are staged

diff <(git diff --name-only) <(git diff --name-only --staged)

comm

Show common lines of two files

Show only common lines: use parameters -1 -2 (both inputs must be sorted)

comm -1 -2 <(git diff --name-only $BACKEND_DIRECTORY) <(git diff --name-only --staged $BACKEND_DIRECTORY)
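comm expects sorted input, so unsorted data has to go through sort first. A minimal sketch with two made-up files in a scratch directory:

```shell
work_dir=$(mktemp -d)

# Sort both inputs before comparing them
printf 'c\na\nb\n' | sort > "$work_dir/one.txt"
printf 'b\nd\nc\n' | sort > "$work_dir/two.txt"

# -1 -2 hides the lines unique to each file: only common lines remain
common=$(comm -1 -2 "$work_dir/one.txt" "$work_dir/two.txt")
echo "$common"

# -2 -3 keeps only the lines unique to the first file
only_first=$(comm -2 -3 "$work_dir/one.txt" "$work_dir/two.txt")
echo "$only_first"

rm -r "$work_dir"
```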

wc

Count lines, words and bytes
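wc reads a file or standard input and counts what the chosen option asks for. A minimal sketch, assuming GNU wc; the sample text is made up:

```shell
text='one two three
four five'

# Number of lines
line_count=$(echo "$text" | wc --lines)
echo "$line_count"

# Number of words
word_count=$(echo "$text" | wc --words)
echo "$word_count"
```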

awk

awk takes text as input and transforms it.

Basically, it splits a line into tokens on runs of whitespace. cut treats every single delimiter as a field boundary, while awk treats several consecutive whitespace characters as one separator.

echo "Hello world" | cut --delimiter=" " --fields=2
# world
echo "Hello    world" | cut --delimiter=" " --fields=2
# (empty line)
echo "Hello world" | awk '{print $2}'
# world
echo "Hello    world" | awk '{print $2}'
# world
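awk can also split on another separator and reorder fields. A minimal sketch using the -F option; the sample rows are reused from the cut section and the variable names are only for illustration:

```shell
# -F sets the field separator; fields are then available as $1, $2, ...
summary=$(printf 'jane;doe;scientist\nmary;shelley;writer\n' | awk -F';' '{print $3 ": " $1}')
echo "$summary"

# NF is the number of fields, so $NF is the last field of the row
last_field=$(echo 'jane;doe;scientist' | awk -F';' '{print $NF}')
echo "$last_field"
```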

sed
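sed edits a text stream line by line. A minimal sketch of substitution and deletion, probably its most common uses; the sample strings are made up:

```shell
# Replace the first match on each line
first=$(echo "foo foo" | sed 's/foo/bar/')
echo "$first"

# Replace every match on each line with the g flag
all=$(echo "foo foo" | sed 's/foo/bar/g')
echo "$all"

# Delete the lines matching a pattern
kept=$(printf 'keep\ndrop\nkeep\n' | sed '/drop/d')
echo "$kept"
```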

uuid

uuidgen

Generating many UUIDs by calling uuidgen in a loop can be time-consuming:

10 s for 10k
1 min 30 s for 100k
30 min for 3 million

time (1>/dev/null for i in {1..10000}; do uuidgen; done)
( for i in {1..10000}; do; uuidgen; done > /dev/null; )  5,69s user 2,95s system 99% cpu 8,670 total

Whether the UUID is random (--random) or time-based (--time) does not matter much

> time (1>/dev/null for i in {1..10000}; do uuidgen --random; done)
( for i in {1..10000}; do; uuidgen --random; done > /dev/null; )  5,71s user 2,86s system 99% cpu 8,620 total
> time (1>/dev/null for i in {1..10000}; do uuidgen --time; done)
( for i in {1..10000}; do; uuidgen --time; done > /dev/null; )  5,84s user 3,03s system 92% cpu 9,621 total

pandoc

Convert from one format to another

pandoc --from <SOURCE_TYPE> --to <TARGET_TYPE> <SOURCE_FILE>

From mediawiki to markdown, removing the original file only if the conversion succeeds.

FILE_NAME=<FILE>; pandoc --from mediawiki --to markdown "$FILE_NAME.mediawiki" > "$FILE_NAME.startup.md" && rm "$FILE_NAME.mediawiki"

Supported formats (from and to) include:

Microsoft Word: docx
Markdown: markdown

Specific

JSON

Key-value / property

{
    "name": "Jane",
    "job" : "Database administrator"
}

jq

https://jqlang.github.io/jq/manual/

Check JSON

. is the identity filter; use it to check that a file is valid JSON

> echo '[{"foo":"bar"},{"foobar":"barfoo"}]' | jq .
[
  {
    "foo": "bar"
  },
  {
    "foobar": "barfoo"
  }
]

Properties (value)

> echo '{"foo":"bar"}' | jq '.foo'
"bar"

Arrays

https://jqlang.github.io/jq/manual/#basic-filters

Get the value of a key in the first element

> echo '[{"foo":"bar"},{"foobar":"barfoo"}]' | jq '.[0].foo'
"bar"

Get the value of a key in every element

> echo '[{"foo":"foo"},{"foo":"bar"}]' | jq '.[].foo'
"foo"
"bar"

Get several values from every element

echo '{ "team" : [{"name": "Jane","job" : "Database administrator"},{"name": "Mattie","job" : "Back-end developer"}] }' | jq '.team | keys[] as $k | [$k, (.[$k] .name), (.[$k] .job)]'

[
  0,
  "Jane",
  "Database administrator"
]
[
  1,
  "Mattie",
  "Back-end developer"
]

https://www.markhneedham.com/blog/2021/05/19/jq-select-multiple-keys/

CSV

Miller

https://miller.readthedocs.io

mlr --csv cat $FILENAME

YAML

Dataset /tmp/team.yml

members:
  - name: Jane
    job: Database administrator
  - name: Mattie
    job: Back-end developer

yq

https://kislyuk.github.io/yq/

Get the value of a key in every element

yq '.members[].name' /tmp/team.yml

Returns

"Jane"
"Mattie"
