Tools: Linux Fundamentals for Data Engineering
Using the CLI
Navigating the CLI

Linux stands out as a useful tool in data engineering because of its distinctive features: the command line interface (CLI), compatibility with most data tools, security, scalability, and cost-effectiveness as an open-source platform. These attributes make data engineering work easier for individuals and organisations alike. If you are a beginner, here are some of the things to look out for as you start your journey in data engineering.

The command line interface is a tool for interacting with programs by typing commands, which work much like shortcuts for getting things done faster.
It involves typing reserved words (commands) into an interface that does not rely on other input devices such as a mouse. Most beginners find this unconventional, but once you master the CLI you will realise it is one of the easiest and most convenient tools available, especially in data engineering.

The CLI can be used to:

Here is what you need to get started:

While the CLI often seems intimidating to a new user, familiarity and ease build up as you learn the right tips and tricks. Get used to working with the keyboard only.

To run a command: type the command and press Enter (the CLI responds by running the command or showing an error)
To clear your screen: use the clear command or Ctrl + L
To reuse a recent command: press the up arrow key
To interrupt a process before it completes: Ctrl + C
To autocomplete: press the Tab key
To copy or paste text: Ctrl + Shift + C or Ctrl + Shift + V
To open a new tab: Ctrl + Shift + T

File management matters because it lets you allow or deny different users the ability to change your files. Think of three people working on a document: Person A is allowed to view, edit and process the document, Person B can do only two of those activities, and Person C can do only one. This protects the document from unintentional distortion or unwanted changes. A rule of thumb is the principle of least privilege, where a user is given only the level of access they need.

Let us look at an example of how to work with the terminal as a data engineer.

Create a directory, navigate into it, and create a new file, file.txt.

Add some text to the file using vi or nano (built-in editors).

We can also download other tools using the CLI. An example is Docker, a containerization tool used by engineers to build, ship and run applications in lightweight packages called containers.

On a Linux-based terminal, run the script:
sudo sh get-docker.sh

(This step assumes get-docker.sh, Docker's convenience install script, has already been downloaded to the current directory.)

Check that Docker is successfully installed:

docker --version

or, for extra details:

docker version
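
The directory-and-file walkthrough and the permissions idea described earlier can be sketched as a short shell script. This is a minimal sketch under stated assumptions, not the author's exact commands: the directory name de_project, the scratch location created with mktemp, the sample text, and the mapping of Persons A, B and C onto the file's owner, group and other users are all hypothetical choices made for illustration. printf stands in for vi or nano so the script can run without an interactive editor.

```shell
#!/bin/sh
# Sketch: create a directory, enter it, create a file, add text,
# then set permissions. All names below are hypothetical.
set -eu

# Work in a scratch directory so no real files are touched (assumption).
workdir="$(mktemp -d)"
cd "$workdir"

mkdir de_project          # create a directory
cd de_project             # navigate into it
touch file.txt            # create a new, empty file

# Non-interactive stand-in for typing text in vi or nano.
printf 'Hello, data engineering\n' > file.txt

# One possible mapping of the three-person example:
#   owner  (Person A): read, write, execute
#   group  (Person B): read, execute
#   others (Person C): read only
chmod u=rwx,g=rx,o=r file.txt

cat file.txt              # show the file contents
ls -l file.txt            # show the resulting permission string
```

In the ls -l output, the first ten characters describe the file type and the three permission groups, so you can confirm at a glance that each group has only the access you intended. For a plain text file you would usually drop the execute permission as well, in keeping with the principle of least privilege.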