Virtual Machine Creation

Environment Creation

The first step in the process is ensuring our machine is optimised to use the NVIDIA platform and GPUs.

Base Distribution: Ubuntu 22.04

We use Ubuntu 22.04 as the base image because:

- This version is a long-term support (LTS) release

- Ubuntu is well supported by NVIDIA

- It works well as a remote instance, giving us access to high-powered GPUs

- Remote environments allow all team members to access our platform

Integrated Development Environment (IDE)

We recommend using Visual Studio Code (VS Code) as our Integrated Development Environment.

Support for Remote Development: VS Code supports remote development, which is crucial for accessing and managing virtual machines with powerful GPUs.

Integrated Terminal and Docker Support: The integrated terminal in VS Code enables direct interaction with command-line tools, essential for managing Docker containers and executing model training scripts.

Extensive Language Support: Large language model development often involves multiple programming languages (like Python, C++). VS Code supports a wide range of languages and their specific tooling, which is critical for such multifaceted development.

Version Control Integration: With built-in Git support, VS Code makes it easier to track and manage changes in code.

Virtual Machine Requirements:

- Docker*

- CUDA (Version 12.3)*: parallel computing platform and programming model

- NVIDIA NGC and the NVIDIA Container Toolkit: NGC hosts NVIDIA's Docker containers; the Container Toolkit lets Docker containers use the GPU

- NVIDIA CUDA Toolkit*: includes the CUDA compiler, which translates CUDA code into executable programs

- GCC: the host compiler required for development using the CUDA Toolkit

- GLIBC: the GNU Project's implementation of the C standard library. Includes facilities for basic file I/O, string manipulation, mathematical functions, and various other standard utilities.

*Please note: Continuum's base virtual machine installation script installs Docker, the NVIDIA Container Toolkit and CUDA Driver 12.3.
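
Whether these requirements are already present can be checked from the terminal. The following is a minimal sketch and not part of the installation script; the helper names `have` and `report` are hypothetical, and it assumes each tool is on the PATH under its usual command name.

```shell
#!/usr/bin/env bash
# Sketch: report which of the required tools are available.
# The helper names below are illustrative, not taken from the install script.

# Return success if a command is available on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# Print "found" or "missing" for a single command name.
report() {
    if have "$1"; then
        echo "found:   $1"
    else
        echo "missing: $1"
    fi
}

for tool in docker nvidia-smi nvcc gcc ldd; do
    report "$tool"
done

# Version details, printed only for the tools that are present.
if have docker; then docker --version; fi
if have nvidia-smi; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader || true
fi
if have nvcc; then nvcc --version | grep release; fi
if have gcc; then gcc --version | head -n 1; fi
if have ldd; then ldd --version 2>&1 | head -n 1; fi   # ldd reports the GLIBC version
```

Anything reported as missing should be covered by the installation script described in the next section.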

Primary Bash Installation

Instructions

After connecting to the VM over SSH from VS Code, open a terminal and download the bash installation script:

wget https://raw.githubusercontent.com/Cognitive-Agency/bashscripts/main/coginstall-light.sh

Make the script executable:

chmod +x coginstall-light.sh

Execute the script:

./coginstall-light.sh

The bash installation script sets up the virtual machine with the tools described in the rest of this page.
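
Because the script replaces the existing .bashrc and .zshrc files, it can be worth backing them up and syntax-checking the download before executing it. This is an optional precaution, sketched here under assumptions: the helper names `backup_dotfiles` and `syntax_check` and the backup directory are hypothetical and not part of the installation script.

```shell
#!/usr/bin/env bash
# Sketch: optional pre-flight steps before running coginstall-light.sh.

# Copy existing shell config files into a backup directory.
backup_dotfiles() {
    local dest="$1" f
    mkdir -p "$dest"
    for f in .bashrc .zshrc; do
        if [ -f "$HOME/$f" ]; then
            cp -p "$HOME/$f" "$dest/$f"
            echo "backed up $f"
        fi
    done
}

# Parse a script without executing it; this only catches syntax errors.
syntax_check() {
    bash -n "$1" && echo "syntax OK: $1"
}

# Hypothetical backup location; any writable directory works.
backup_dotfiles "$HOME/dotfiles-backup"

if [ -f coginstall-light.sh ]; then
    syntax_check coginstall-light.sh
else
    echo "coginstall-light.sh not found in $(pwd); download it first"
fi
```

To keep a record of the installation, the script's output can also be saved with `./coginstall-light.sh 2>&1 | tee coginstall.log`.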

What is in the bash script
  1. Bash Strict Mode: The script starts with bash strict mode enabled (set -euo pipefail) to prevent potential issues from unhandled errors.

  2. Signal Trapping: It traps the INT signal to handle interruptions (like Ctrl-C) gracefully.

  3. Utility Functions: Includes functions to print messages (print_message), check if commands exist (command_exists), and print errors (print_error).

  4. Downloading Configuration Files: It replaces the existing .bashrc and .zshrc files with new versions from a specified URL.

  5. Checking Essential Commands: Verifies the presence of critical commands like curl, wget, sudo, dpkg, and getent.

  6. Snap Installation: Installs and ensures the snap package manager is running.

  7. System Update and Package Installation: Updates the system and installs a variety of packages including git, awscli, curl, vim, htop, and many others, each serving specific purposes like version control, system monitoring, text editing, etc.

  8. Docker Setup: Checks for Docker installation and sets it up if not present.

  9. Miniconda Setup: Installs Miniconda for Python environment management if it's not already installed.

  10. CUDA Drivers Installation: Downloads and installs NVIDIA CUDA drivers.

  11. NVIDIA Container Toolkit: Installs the NVIDIA container toolkit and configures it for Docker.

  12. Path Configuration: Adds CUDA to the system path.

  13. Additional Utilities Installation: Installs a variety of additional utilities like exa, thefuck, and others.

  14. Docker Images Setup: Pulls specific Docker images like NVIDIA PyTorch.

  15. Oh My Zsh Setup: Installs and configures Oh My Zsh, along with plugins and the Powerlevel10k theme.

  16. Final Summary: Prints a summary of the installations and setup processes completed.
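
The first few items in this list (strict mode, signal trapping, utility functions, and the check for essential commands) follow a common bash pattern. The sketch below illustrates that pattern only. The function names come from the description above, but their bodies, the message formats and the exit code are assumptions and may differ from the actual script.

```shell
#!/usr/bin/env bash
# Sketch of the opening of an install script; not the actual coginstall-light.sh.

# Strict mode: exit on errors, on unset variables, and on failed pipeline stages.
set -euo pipefail

# Handle Ctrl-C: print a message and exit with the conventional code 130.
trap 'print_error "Interrupted by user"; exit 130' INT

# Utility functions (bodies are illustrative).
print_message()  { printf '==> %s\n' "$1"; }
print_error()    { printf 'ERROR: %s\n' "$1" >&2; }
command_exists() { command -v "$1" >/dev/null 2>&1; }

print_message "Checking essential commands"
for cmd in curl wget sudo dpkg getent; do
    if command_exists "$cmd"; then
        echo "ok: $cmd"
    else
        # An installer would probably stop here; this sketch only reports.
        print_error "$cmd is required but not installed"
    fi
done
```

Testing `command_exists` inside an `if` keeps strict mode from aborting the script when a command is absent, since `set -e` ignores failures in a condition.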

The script is well-structured for modularity and includes informative print statements for each major step, making it user-friendly and easy to track its progress during execution. It's designed to enhance the capabilities of a Unix system with a wide array of development and system management tools.
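
Once the script finishes, the CUDA path entry and GPU access from Docker can be checked by hand. This is a minimal sketch under stated assumptions: `/usr/local/cuda/bin` is assumed to be the CUDA install location, `ubuntu:22.04` is an arbitrary image chosen for the test, and `path_contains` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: post-install checks for the CUDA path and for GPU access in Docker.

# Return success if the given directory is an entry in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed default CUDA location; adjust if CUDA was installed elsewhere.
if path_contains /usr/local/cuda/bin; then
    echo "CUDA is on PATH"
else
    echo "CUDA is not on PATH (open a new shell or check your shell config)"
fi

# The NVIDIA Container Toolkit makes nvidia-smi available inside the container,
# so a plain Ubuntu image is enough for this check.
# Prefix the command with sudo if your user is not in the docker group.
if command -v docker >/dev/null 2>&1; then
    docker run --rm --gpus all ubuntu:22.04 nvidia-smi \
        || echo "GPU check failed; verify the driver and container toolkit"
else
    echo "docker not found; run the installation script first"
fi
```

If the container prints the usual `nvidia-smi` table, Docker can reach the GPU.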

Packages installed by the script

The script includes the installation of a wide range of tools and utilities for Ubuntu, enhancing its capabilities for development, system administration, and general use:

  1. git: Distributed version control system.

  2. awscli: Command-line interface for AWS services.

  3. curl: Tool for making web requests.

  4. vim: Highly configurable text editor.

  5. htop: Interactive process viewer for Unix.

  6. tmux: Terminal multiplexer for managing multiple sessions.

  7. build-essential: Essential packages for compiling C programs.

  8. zsh: Z shell, an extended Bourne-style shell with many improvements.

  9. software-properties-common: Tools for managing software repositories, such as add-apt-repository.

  10. apt-transport-https: HTTPS transport for the APT package manager.

  11. ca-certificates: Common CA certificates for SSL.

  12. gnupg-agent: GPG agent for private-key operations.

  13. cmake: Cross-platform build process manager.

  14. gnupg: Tool for data encryption and signing.

  15. nvtop: NVIDIA GPU monitoring tool.

  16. screen: Terminal multiplexer.

  17. glances: Cross-platform system monitoring tool.

  18. parallel: Execute jobs in parallel.

  19. git-lfs: Git extension for large files.

  20. ffmpeg: Multimedia framework.

  21. bash-completion: Auto-completion for bash commands.

  22. silversearcher-ag: The Silver Searcher (ag), a fast code search tool.

  23. tldr: Simplified community-driven man pages.

  24. fzf: Command-line fuzzy finder.

  25. ncdu: Disk usage analyzer.

  26. jq: Command-line JSON processor.

  27. tree: Visual directory tree display.

  28. tmate: Remote terminal sharing tool.

  29. byobu: Window manager and terminal multiplexer.

  30. ranger: Console file manager with vi-like keybindings.

  31. bat: Enhanced 'cat' command with syntax highlighting.

  32. ripgrep: Fast recursive text searcher (rg).

  33. neofetch: System information tool.

  34. mc: Midnight Commander, visual file manager.

  35. iproute2: Advanced network management.
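
To confirm which of these packages are actually installed on a given machine, `dpkg-query` can be used. This is a minimal sketch for Debian and Ubuntu systems; the function name `pkg_status` is hypothetical and the package list is only a sample of the names above.

```shell
#!/usr/bin/env bash
# Sketch: report the install status of packages using dpkg-query.

# Print "installed" or "missing" for each package name given.
pkg_status() {
    local pkg
    for pkg in "$@"; do
        if dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null \
                | grep -q "install ok installed"; then
            echo "installed: $pkg"
        else
            echo "missing:   $pkg"
        fi
    done
}

# A sample of the packages listed above.
pkg_status git awscli curl vim htop tmux build-essential zsh cmake nvtop jq tree ripgrep
```

Packages reported as missing can be installed individually with `sudo apt-get install <package>`.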

Information on Ubuntu packages

curl: A command-line tool for transferring data using various protocols. It supports downloading and uploading files via URLs and is commonly used in scripting and automation tasks.

wget: Another command-line tool used for retrieving files from web servers. It supports downloading files and recursive downloading, making it useful for batch downloads and mirroring websites.

awscli: The AWS Command Line Interface. It provides a unified command-line interface to interact with various Amazon Web Services (AWS) resources and services. It allows you to manage and automate your AWS infrastructure and services.

git: A distributed version control system commonly used for source code management. It enables collaborative development, version tracking, and code sharing across multiple contributors.

net-tools: A package containing various networking tools, such as ifconfig for network interface configuration, netstat for network statistics, and route for managing the IP routing table.

build-essential: A package that includes essential tools and libraries required for building software on Ubuntu. It includes compilers, libraries, and development headers necessary for compiling and linking programs.

python3-dev: Development headers and libraries for building Python extensions. It is typically required when installing packages that involve compiling Python modules or using C/C++ extensions.

python3-pip: The package installer for Python. It allows you to easily install and manage Python packages from the Python Package Index (PyPI) using the pip command.

python3-venv: A module providing support for creating lightweight virtual environments for Python projects. It enables you to isolate project dependencies and environments, allowing for better dependency management and reproducibility.

htop: An interactive process viewer and system monitor. It provides a real-time overview of system resources, CPU usage, memory usage, and other system information in a more user-friendly interface than the traditional top command.

vim: A highly configurable text editor often used in the command line interface. It provides powerful editing features and syntax highlighting, making it popular among developers and system administrators.

tmux: A terminal multiplexer that allows you to manage multiple terminal sessions within a single window. It provides features like session management, window splitting, and detached sessions, enhancing productivity and workflow in the terminal.

ack: A tool for searching text files, especially source code, with advanced pattern matching and search capabilities. It is often used as a faster and more convenient alternative to the traditional grep command.

tree: A command-line utility for displaying directory and file tree structures in a visually organized format. It helps visualize the directory hierarchy and file structure, making it easier to understand and navigate within a directory.

These packages provide a range of functionality for networking, development, package management, version control, text editing, system monitoring, and more, depending on your specific needs.
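
As an example of how python3-venv and python3-pip work together, the sketch below creates and activates an isolated environment. The directory name `.venv` is a common convention but an assumption here, and `make_venv` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: create a Python virtual environment and activate it.

# Create a virtual environment in the given directory and activate it
# in the current shell. Requires the python3-venv package.
make_venv() {
    python3 -m venv "$1" && . "$1/bin/activate"
}

if make_venv .venv; then
    # Inside the environment, pip refers to the environment's own copy.
    python -m pip --version || true
    echo "active environment: $VIRTUAL_ENV"
else
    echo "could not create a virtual environment; is python3-venv installed?"
fi
```

Packages installed with `pip install` while the environment is active stay inside `.venv`; running `deactivate` returns the shell to the system Python.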

Ubuntu file structure

/bin: Contains essential binary executables (like ls, cp, mv, rm) that are required for the basic functioning of the system. These commands are usable by both the system administrator and non-privileged users.

/boot: Holds files required for the system boot process, including the Linux kernel and a ramdisk image. It usually includes grub (boot loader) configuration files as well.

/dev: Contains device files, which are interface points for the hardware devices that are used by the system. Examples include /dev/null, /dev/random, /dev/mem, etc.

/etc: Contains system-wide configuration files and directories. Almost all applications store their configuration files in this directory.

/home: Home directories for all users to store their personal files.

/lib: Contains library files, which hold code that can be used by multiple programs at once.

/media: Temporary mount directory for removable devices, like USBs, CD-ROMs, etc.

/mnt: Temporarily mounted filesystems.

/opt: Optional or third-party software should be placed in this directory.

/proc: A virtual filesystem containing information about system resources.

/root: This is the home directory for the root user.

/run: A tmpfs (temporary file system) storing runtime variable data. It is mapped into memory and is not written to the hard disk.

/sbin: Contains binary files which are essential for the working of the system that are typically used by system administrators.

/srv: Contains data for services provided by the system.
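
The layout above can be explored directly from the terminal. The following is a minimal sketch; `describe_dir` is a hypothetical helper. On Ubuntu 22.04, /bin, /lib and /sbin are typically symbolic links into /usr, which the output makes visible.

```shell
#!/usr/bin/env bash
# Sketch: inspect the top-level directories described above.

# Report whether a path is a symlink, a real directory, or absent.
describe_dir() {
    if [ -L "$1" ]; then
        echo "$1 -> $(readlink "$1")"
    elif [ -d "$1" ]; then
        echo "$1 (directory)"
    else
        echo "$1 (not present)"
    fi
}

for d in /bin /boot /dev /etc /home /lib /media /mnt /opt /proc /root /run /sbin /srv; do
    describe_dir "$d"
done

# /proc exposes kernel and process information as readable files.
if [ -r /proc/meminfo ]; then
    head -n 3 /proc/meminfo
fi

# /run is normally backed by tmpfs; df -T shows the filesystem type.
df -T /run 2>/dev/null || true
```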
