16 Bash Commands Data Scientists Must Know

Bash commands are an important part of the data scientist’s toolkit. This guide introduces you to some of the most important ones.

Updated by Matthew Urwin | Mar 06, 2025
Summary: Bash commands such as ls, cp and grep help with navigating directories, managing files, searching text and more. Here’s a list of the top bash commands data scientists need to know to simplify workflows and boost their productivity.

Data scientists need a basic understanding of bash and its commands. Bash is a Unix shell, usually accessed through a terminal (also called the console or command line), that helps you navigate your machine and perform tasks such as managing files and searching text.

In this article, we’re going to explore a few of the most commonly used bash commands that every data scientist must know.

16 Bash Commands Data Scientists Must Know

  1. ls Command
  2. cd Command
  3. rm Command
  4. mv Command
  5. cp Command
  6. mkdir Command
  7. pwd Command
  8. touch Command
  9. cat Command
  10. less Command
  11. more Command
  12. grep Command
  13. curl Command
  14. which Command
  15. top Command
  16. history Command


16 Bash Commands to Know

1. ls Command

The ls (list) command is used to list directories or files. By default (i.e., running ls with no options at all) the command will return the directories and files of the current directory, excluding any hidden files. Some of the most useful options are:

  • ls -a: List all the files in the current directory, including hidden files (those whose names start with a dot)
  • ls -l: Long listing of the files in the current directory, including their permissions, owner, size and modification time

Syntax

ls [OPTIONS] [FILES]

Example

# Long listing of all directories and files (including hidden ones) in the current directory
$ ls -la
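As a rough illustration of how the -a option changes the output, the sketch below builds a throwaway directory with one regular file and one hidden file. The directory comes from mktemp, and the file names data.csv and .env are made up for this example.

```shell
# Throwaway directory; data.csv and .env are hypothetical file names
sandbox=$(mktemp -d)
touch "$sandbox/data.csv" "$sandbox/.env"

# Default listing: hidden files (names starting with a dot) are left out
visible=$(ls "$sandbox")
echo "ls:    $visible"

# -a also lists hidden files, plus the special entries . and ..
with_hidden=$(ls -a "$sandbox")
echo "ls -a:"
echo "$with_hidden"

# -l adds one line per entry with permissions, owner, size and time stamp
ls -la "$sandbox"
```

Passing a directory as the argument, as done here, lists that directory instead of the current one.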

2. cd Command

The cd (change directory) command is used to navigate the directory tree structure.

Syntax

cd [OPTIONS] directory

The two main options are -L, which follows symbolic links (the default behavior), and -P, which uses the physical directory structure without following symbolic links.

Example

# Change the current directory to myproject
$ cd myproject
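The difference between -L and -P only shows up when a symbolic link is involved. The following is a minimal sketch under that assumption: it creates a hypothetical directory real_data and a link data_link pointing to it inside a temporary directory, then enters the link both ways.

```shell
# Throwaway directory; real_data and data_link are hypothetical names
sandbox=$(mktemp -d)
mkdir "$sandbox/real_data"
ln -s "$sandbox/real_data" "$sandbox/data_link"

# -L (the default) keeps the symbolic link in the reported path
cd -L "$sandbox/data_link" || exit 1
logical=$(basename "$(pwd -L)")
echo "cd -L lands in: $logical"     # data_link

# -P resolves the link and moves into the directory it points to
cd -P "$sandbox/data_link" || exit 1
physical=$(basename "$(pwd -P)")
echo "cd -P lands in: $physical"    # real_data
```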

3. rm Command

The rm (remove) command is used to delete files, directories or even symbolic links from your file system. Some of the most useful options are:

  • rm -i: Prompt for confirmation before removing each file.
  • rm -r: Remove directories recursively, including all the files within them.
  • rm -f: Remove files or directories without prompting, even if they are write-protected (the f stands for force).

Syntax

rm [OPTIONS]... FILE...

Example

# Force deletion of the directory named directoryName
$ rm -rf directoryName
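Because rm deletes permanently, it can help to try the options in a scratch directory first. This sketch assumes a temporary directory from mktemp and made-up names (old_run, keep.csv); it shows that plain rm refuses directories, that -r removes them with their contents, and that -f stays silent about missing files.

```shell
# Throwaway directory; old_run and keep.csv are hypothetical names
sandbox=$(mktemp -d)
mkdir -p "$sandbox/old_run/logs"
touch "$sandbox/old_run/logs/train.log" "$sandbox/keep.csv"

# Without -r, rm refuses to delete a directory
rm "$sandbox/old_run" 2>/dev/null || echo "plain rm refuses directories"

# -r removes the directory and everything inside it
rm -r "$sandbox/old_run"

# -f suppresses the error for a file that does not exist
rm -f "$sandbox/does_not_exist.txt"
status_missing=$?
echo "exit status of rm -f on a missing file: $status_missing"
```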

4. mv Command

The mv (move) command is used to move one or more directories or files from one location in the file system to another.

Syntax

mv [OPTIONS] SOURCE DESTINATION

  • SOURCE can be one or more directories or files
  • DESTINATION can be a file (used for renaming files) or a directory (used for moving files and directories into other directories).

Example

# Rename file
$ mv file1.txt file2.txt

# Move a file into a different directory
$ mv file1.txt anotherDir/
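Since SOURCE can name several files, one mv call can collect them into a directory. The sketch below is an illustration only, run inside a temporary directory with hypothetical file names (jan.csv, feb.csv, notes.txt) and a hypothetical target directory raw.

```shell
# Throwaway directory; all file and directory names are hypothetical
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch jan.csv feb.csv notes.txt
mkdir raw

# Several sources, one destination directory
mv jan.csv feb.csv raw/

# A single source and a new file name: this renames the file
mv notes.txt README.txt

ls raw
ls
```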

5. cp Command

The cp (copy) command lets you copy files or directories within the file system. Some of the most useful options are:

  • cp -u file1.txt file1_final.txt: Copy file1.txt to file1_final.txt only if the source is newer than the destination or the destination does not exist.
  • cp -R myDir/ myDir_BACKUP: Copy a directory recursively, including everything inside it.
  • cp -p file1.txt file1_final.txt: Copy file1.txt and preserve its mode, ownership and time stamps.

Syntax

cp [OPTIONS] SOURCE... DESTINATION

  • SOURCE may contain one or more directories or files
  • DESTINATION must be a single directory or file

Example

# Copy files
$ cp file1.txt file1_final.txt

# Copy directories (and preserve ownership)
$ cp -Rp myDir/ myDirBackup
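One common use of cp -Rp is taking a backup of a project folder before changing it. The following sketch assumes a temporary directory and a made-up project layout (project/data/sample.csv); it copies the tree and then checks that the copied file is identical to the original.

```shell
# Throwaway directory; the project layout is hypothetical
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
mkdir -p project/data
echo "id,value" > project/data/sample.csv

# -R copies the whole directory tree; -p keeps mode and time stamps
# (and ownership, where the system allows it)
cp -Rp project project_backup

# cmp exits with status 0 when the two files are byte-for-byte identical
cmp project/data/sample.csv project_backup/data/sample.csv && echo "backup matches"
```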

6. mkdir Command

The mkdir (make directory) command creates new directories in the file system.

Syntax

mkdir [OPTION] [DIRECTORY]

  • DIRECTORY can be one or more directories

Example

# Create new directory with name myNewDir
$ mkdir myNewDir
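When setting up a project you often need nested folders. The sketch below uses the -p option, which is not covered above: it creates any missing parent directories and does not complain if a directory already exists. The layout (project/data/raw, project/notebooks) is a made-up example inside a temporary directory.

```shell
# Throwaway directory; the project layout is hypothetical
sandbox=$(mktemp -d)

# -p creates missing parents (project, project/data) along the way,
# and mkdir accepts more than one directory per call
mkdir -p "$sandbox/project/data/raw" "$sandbox/project/notebooks"

# Running it again is not an error because of -p
mkdir -p "$sandbox/project/data/raw"
rerun_status=$?
echo "exit status of the second call: $rerun_status"
```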

7. pwd Command

The pwd (print working directory) command can be used to report the absolute path of the current working directory.

Example

# Report the path of the current working directory
$ pwd
/Users/administrator

8. touch Command

The touch command creates new empty files or updates the time stamps of existing files and directories. If a file already exists, touch just updates its time stamps. If it does not exist, touch creates it.

Some of the most useful options are:

  • touch -c file1.txt: If file file1.txt already exists, then this command will update the file’s time stamps. Otherwise, it will do nothing.
  • touch -a file1.txt: Updates only the access time stamp of the file.
  • touch -m file1.txt: Updates only the modification time of the file.

Syntax

touch [OPTIONS] [FILES]

Example

# Create a new file (file1.txt does not exist)
touch file1.txt

# Update the access time of the file (file1.txt already exists)
touch -a file1.txt
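To see how -c differs from a plain touch, the sketch below runs both in a temporary directory. The file names results.csv and missing.csv are hypothetical.

```shell
# Throwaway directory; results.csv and missing.csv are hypothetical names
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

# Plain touch creates the file because it does not exist yet
touch results.csv

# With -c, touch never creates a file, so missing.csv stays absent
touch -c missing.csv

ls
```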

9. cat Command

The cat (concatenate) command is a very commonly used command that lets you read files, concatenate them or write their contents to the standard output.

Some of the most useful options are:

  • cat -n file1.txt: Display the contents of file1.txt along with line numbers.
  • cat -T file1.txt: Display the contents of file1.txt and distinguish tabs from spaces (tabs are displayed as ^I in the output).

Syntax

cat [OPTIONS] [FILE_NAMES]

  • FILE_NAMES can be zero or more file names

Example

# Display the content of file $HOME/.pip/pip.conf
cat $HOME/.pip/pip.conf

# Append the content of file1.txt to file2.txt
cat file1.txt >> file2.txt
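Concatenation is handy for stitching data files together. As a small sketch, assuming made-up CSV fragments (header.csv, rows.csv, extra.csv) in a temporary directory, the code below joins two files into a new one with >, appends a third with >>, and prints the result with line numbers.

```shell
# Throwaway directory; all file names and contents are hypothetical
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
printf 'id,value\n' > header.csv
printf '1,10\n2,20\n' > rows.csv
printf '3,30\n' > extra.csv

# Concatenate two files into a new file (> overwrites the target)
cat header.csv rows.csv > combined.csv

# Append another file to the end (>> keeps the existing content)
cat extra.csv >> combined.csv

# Count the lines, then show them with line numbers
line_count=$(wc -l < combined.csv | tr -d ' ')
echo "combined.csv has $line_count lines"
cat -n combined.csv
```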


10. less Command

The less command lets you display the contents of a file one page at a time. It doesn’t read the entire file before displaying it, so it opens large files quickly.

Some of the most useful options are:

  • less -N file1.txt: Display the content (first page) of file1.txt and show line numbers.
  • less -X file1.txt: By default, when you exit less, the content of the file is cleared from the screen. Use the -X option to keep the content on the screen after exiting.

Syntax

less [OPTIONS] filename

Example

# Display the content of file $HOME/.pip/pip.conf
less $HOME/.pip/pip.conf

11. more Command

The more command can also be used to display the content of a file in the command line. Unlike less, which supports both forward and backward navigation, more only supports forward navigation. The less command also doesn’t have to read an entire file before displaying it page by page, making it a bit faster than the more command.

Some of the most useful options for the more command are:

  • more -p file1.txt: Clear the command line screen and then display the content of file1.txt
  • more +100 file1.txt: Display the content of file1.txt starting from the 100th line onwards.

Syntax

more [OPTION] filename

Example

# Display the content of file $HOME/.pip/pip.conf
more $HOME/.pip/pip.conf

12. grep Command

The grep (global regular expression print) command is useful when you wish to search for a particular string in files.

Some of the most useful options are:

  • grep -v Andrew employees.txt: Invert the match. In other words, display all the lines in employees.txt that do not match the pattern Andrew.
  • grep -r Andrew dirName/: Recursively search for the pattern Andrew in all files under the directory dirName.
  • grep -i Andrew employees.txt: Perform a case-insensitive search.

Syntax

grep [OPTIONS] PATTERN [FILE...]

  • PATTERN is the search pattern.
  • FILE can be zero or more input file names.

Example

# Search for `export` (case insensitive) in user profile
$ grep -i export ~/.bash_profile
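The options can be compared side by side on a small file. This sketch assumes a made-up employees.txt with three rows, created in a temporary directory. It also uses -c, an option not covered above, which prints the number of matching lines instead of the lines themselves.

```shell
# Throwaway directory; employees.txt and its contents are hypothetical
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
printf 'Andrew,Data Scientist\nMaria,ML Engineer\nandrew,Analyst\n' > employees.txt

# Case-sensitive count: only the capitalized Andrew matches
exact=$(grep -c Andrew employees.txt)
echo "case-sensitive matches:   $exact"      # 1

# -i ignores case, so both Andrew and andrew match
any_case=$(grep -ci andrew employees.txt)
echo "case-insensitive matches: $any_case"   # 2

# -v inverts the match: keep only the lines without the pattern
others=$(grep -vi andrew employees.txt)
echo "everyone else: $others"                # Maria,ML Engineer
```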

13. curl Command

The curl command is used to download or upload data using protocols such as FTP, SFTP, HTTP and HTTPS.

Syntax

curl [OPTIONS] [URL...]

Example

# Fetch google.com and follow any redirects (-L)
$ curl -L google.com

14. which Command

The which command is used to identify and report the location of the provided executable. For instance, you may wish to see the location of the executable when calling python3.

Syntax

which [OPTIONS] FILE_NAME

Example

$ which python3
/usr/local/bin/python3
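Because which exits with a non-zero status when it cannot find an executable, it can serve as a quick dependency check before running a script. The sketch below relies on that behavior and assumes which itself is installed; definitely_not_installed_tool is a made-up name used only to show the failure case.

```shell
# "definitely_not_installed_tool" is a hypothetical name that should not exist
missing=""
for tool in ls cat definitely_not_installed_tool; do
    if which "$tool" >/dev/null 2>&1; then
        # Found: print the path that would be executed
        echo "$tool -> $(which "$tool")"
    else
        # Not found: remember the name for the report below
        missing="$missing $tool"
    fi
done

echo "not found:$missing"
```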

15. top Command

The top command can help you monitor running processes and the resources (such as memory) they are currently using.

Some of the most useful options are:

  • top -u myuser: Display only the processes owned by the user myuser.

Example

# Show the running processes and their resource usage
$ top

16. history Command

The history command displays the history of the commands that you’ve recently run.

Some of the most useful options are:

  • history 5: Display the last five commands.
  • history -c: Clear the history list.
  • history -d 10-20: Delete entries 10 to 20 from the history list (the range form requires Bash 5.0 or later; history -d 10 deletes a single entry).

Example

# Find recent commands in the history that include python3
$ history | grep python3


Bash Commands for Data Science

In this article, we explored a small subset of the most commonly used bash commands. Data scientists should be comfortable with the command line because it helps them perform basic tasks easily and, most importantly, efficiently. You don’t need to become a bash guru, but knowing these commands is an important skill that can help you excel as a data scientist.

Frequently Asked Questions

Bash is a program that lets a user interact with a system and its various components. Data scientists can use bash commands to quickly perform tasks like handling large data sets, searching text and managing files. 

The more command only supports forward navigation when viewing a file. The less command supports both forward and backward navigation during viewing, and it doesn’t have to read an entire file before displaying it. This makes it more dynamic and faster than the more command. 

Locate an installed program by using the which command followed by the program’s name. For example, enter “which python3” to find Python 3.
