SD 212 Spring 2024 / Admin


Software Installs and Environment Setup

General Tips

Getting your software and programming tools working correctly can be tedious and annoying, but it’s an important part of the life of a data scientist.

Before you start any new install, make sure you have time to complete it. That means having plenty of battery (or an outlet available), a fast internet connection, and nowhere else to be for the next hour.

You will need all of the software packages below. Most of them you should already have from previous classes; come back here and re-install if your laptop gets wiped by ITSD or something stops working.

WSL/Ubuntu

Installation

This should be fine from SD211; it is the same as Step 1 in the SD211 setup instructions.

  1. Open PowerShell as a Windows administrator: hit WindowsKey+R to bring up the run dialog, type powershell, then hit Ctrl+Shift+Enter to run as administrator.

  2. Run this command in PowerShell:

    Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

  3. Restart Windows

  4. Open Software Center and install “Ubuntu 20.04 LTS”

  5. After the install completes, open Ubuntu for the first time. You will be prompted to choose a username and password.

    You must choose your USNA username m2XXXXX as the username.

    Use any simple password. It doesn’t need to be (and probably shouldn’t be) the same as your USNA password, and is only used to install software and other stuff inside Ubuntu on your laptop, so security isn’t a huge concern here. Keep it simple and memorable.
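
The username matters later on: the SSH key commands on this page log in as "$USER", so your Ubuntu username has to match your USNA one. As a quick sanity check, here is a minimal sketch to run in the Ubuntu terminal. The function name valid_usna_username is made up for illustration, and the pattern (m2 followed by five digits, as in m261234) is an assumption based on the example usernames on this page.

```shell
# Hypothetical helper: succeed if the name looks like m2XXXXX.
# Assumes each X is a digit, as in the example m261234.
valid_usna_username() {
    case "$1" in
        m2[0-9][0-9][0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the current Ubuntu user.
if valid_usna_username "$(id -un)"; then
    echo "username looks right: $(id -un)"
else
    echo "username $(id -un) does not look like m2XXXXX"
fi
```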

Forgot Ubuntu WSL password

If you forget your (simple, non-USNA) WSL Ubuntu password at some point, it’s easy to reset it.

Just follow the instructions on this StackOverflow post. In the first step, open cmd.exe from the Windows taskbar; in the third step, use your own username, such as m261234.

Fix WSL networking settings

Before you start doing anything in your fresh Ubuntu WSL installation, you need to fix the DNS and certificate settings so that it will work on the USNA network.

First make sure you are actually connected to the USNA mission network. Then run this command which downloads and executes a bash script to fix things up all nice-like:

curl -s http://faculty.cs.usna.edu/~roche/fix-wsl.sh --resolve 'faculty.cs.usna.edu:80:10.1.83.71' | bash
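
The --resolve option in the command above tells curl which IP address to use for that host and port instead of looking it up, presumably because name lookups may not work yet on a fresh install. After the script finishes, a rough way to check that lookups work is sketched below. The function name can_resolve is made up for illustration, and this only tests DNS, not the certificate settings.

```shell
# Hypothetical helper: succeed if the system resolver can look up a hostname.
can_resolve() {
    getent hosts "$1" > /dev/null
}
```

For example, while connected to the USNA network:

    can_resolve faculty.cs.usna.edu && echo "DNS lookup works"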

Software update

The software inside WSL/Ubuntu is not updated automatically by ITSD/Windows like everything else on your laptop. You should run this update periodically, and at the very least at the start of each semester.

First, open an Ubuntu terminal from the start menu. Then run these two commands, in order. When asked, enter your simple Ubuntu password.

sudo apt update
sudo apt full-upgrade
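
If you are curious how much there is to upgrade, you can count the pending packages between the two commands. This is only a sketch: the function name count_upgradable is made up, and it assumes each upgradable package line in the output of apt list --upgradable contains the text "upgradable from:".

```shell
# Hypothetical helper: read `apt list --upgradable` output on standard
# input and print how many package lines it contains.
count_upgradable() {
    # grep -c exits nonzero when the count is 0, so ignore its exit status.
    grep -c 'upgradable from:' || true
}
```

Run it after sudo apt update like this:

    apt list --upgradable 2>/dev/null | count_upgradable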

SD212 Directory

On a lab machine or over ssh, you just need to make a new directory, which you can do with this command:

mkdir ~/sd212

On Ubuntu/WSL, you can run these commands in an Ubuntu terminal so that you get a directory called sd212 which is visible on your desktop as well as within Ubuntu.

winhome=$(wslpath "$(wslvar USERPROFILE)")
mkdir -p "$winhome/Desktop/sd212"
ln -sfn "$winhome/Desktop/sd212" ~/sd212
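
To confirm the link was created, you can check that ~/sd212 is a symbolic link pointing at a real directory. A minimal sketch, with a made-up function name is_dir_link:

```shell
# Hypothetical helper: succeed if $1 is a symbolic link whose target
# is an existing directory.
is_dir_link() {
    [ -L "$1" ] && [ -d "$1" ]
}
```

For example, this prints where the link points if everything is in place:

    is_dir_link ~/sd212 && readlink -f ~/sd212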

SSH keys

(Should already be done from SD211.)

Setting up SSH keys makes it so that you can easily access the CS department server and lab machines through SSH without having to type your password every time.

Run these commands from an Ubuntu terminal window. On the third command, you may be prompted for a password. That should be your USNA password, not your Ubuntu password.

mkdir -p ~/.ssh
[[ -e ~/.ssh/id_ed25519 ]] || ssh-keygen -t ed25519 -N ''
ssh-copy-id "$USER"@ssh.cs.usna.edu
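
To check that the keys are in place, you can look for both key files and then try a login that is not allowed to ask for a password. This is a sketch only: the function name has_keypair is made up, and it assumes you accepted the default key location ~/.ssh/id_ed25519 when ssh-keygen asked.

```shell
# Hypothetical helper: succeed if both halves of an SSH key pair exist.
# $1 is the private key path; the public key is expected at "$1.pub".
has_keypair() {
    [ -f "$1" ] && [ -f "$1.pub" ]
}
```

The BatchMode option makes ssh fail instead of prompting, so this line only prints the message if key-based login works:

    has_keypair ~/.ssh/id_ed25519 && ssh -o BatchMode=yes "$USER"@ssh.cs.usna.edu true && echo "passwordless login works"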

Mamba

You need to do this BOTH on your laptop and on a lab machine or ssh.cs.usna.edu.

Mamba is a tool to manage Python packages which we already used in SD211. You should not need to install Mamba itself again, but you will need to do the second part below to create your sd212 environment.

Re-installing mamba

(Should be already done from SD211, but feel free to do again if something stops working.)

  1. Download mambaforge

    Open a terminal and run the following to download the mambaforge installer from github:

    wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh

    This will download a file of about 100MB called Mambaforge-Linux-x86_64.sh into your current directory.

  2. Run the mambaforge installer

    The installer you downloaded is actually a bash script. To run it, type:

    bash Mambaforge-Linux-x86_64.sh

    Follow the prompts when asked. When asked whether you want to initialize conda at the end, type yes.

SD212 Environment

(Remember you will need to do this both on your laptop in WSL, and on the lab machines or over ssh.)

Start by opening a new terminal (if you just reinstalled mamba, close that terminal and reopen a new one).

Then run these commands one at a time. (Note, the last command is one very long line to install lots of awesome data science packages. Be sure to copy-paste the whole line!)

mamba activate base
mamba update --all
mamba install python=3.12
mamba create -n sd212 python=3.12 ipython numpy pandas ipykernel matplotlib plotly seaborn scikit-learn opencv bs4 lxml nltk easygui mypy openpyxl pandas-stubs flask pypdf pycryptodome requests
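
Once the environment is created, a quick way to confirm it works is to activate it and try importing a few of the packages. The sketch below is not part of the official setup, and check_modules is a made-up name. Note that some packages are imported under a different name than the one you install: scikit-learn is imported as sklearn and opencv as cv2.

```shell
# Hypothetical helper: try importing each named Python module with the
# python on your PATH and report the ones that fail.
# Returns nonzero if any import fails.
check_modules() {
    missing=""
    for mod in "$@"; do
        python -c "import $mod" 2>/dev/null || missing="$missing $mod"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all imports OK"
}
```

For example:

    mamba activate sd212
    check_modules numpy pandas matplotlib plotly seaborn sklearn cv2 bs4 lxml nltk flask requests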

Xming

(Should already be done from SD211)

Xming is a small Windows utility that lets you display GUIs from WSL or ssh.

Install

  1. Visit https://sourceforge.net/projects/xming/
  2. Download Xming
  3. Run the file you just downloaded as administrator

Running/restarting

Xming should be running all the time on your laptop, before you open VS Code for example. If Xming is running, you will see its little X icon in the system tray at the bottom-right of the taskbar.

If not, then maybe Xming got closed or crashed for some reason. You should be able to find the Xming program in the start menu and just click it to start it up again.
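
Programs that open a GUI find Xming through the DISPLAY environment variable, so if a window refuses to appear it is worth checking that variable from your Ubuntu or ssh terminal. A minimal sketch follows. The function name display_status is made up, and whether DISPLAY is already set in your terminal depends on your setup, which this page does not spell out.

```shell
# Hypothetical helper: describe whether an X display is configured
# in the current terminal session.
display_status() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to $DISPLAY"
    else
        echo "DISPLAY is not set"
    fi
}

display_status
```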

VS Code

(This should be already done from SD211; see step 3 in these instructions.)

Installation

  1. Go to https://code.visualstudio.com/ and download for Windows.
  2. Run the installer after it downloads.

Setup WSL

  1. Open VS Code
  2. Open the “Extensions” pane (icon with 4 squares on the left side)
  3. Search for and install the “Remote Development” extension from Microsoft.
  4. After that install completes, click the green icon at the very bottom-left and select “New WSL Window”. It should now say “WSL: Ubuntu 20.04” at the bottom left.
  5. Now open “Extensions” again. Install the Python and Jupyter extensions from Microsoft.

Setup ssh connection to ssh.cs.usna.edu

  1. Close VS Code and open PowerShell in Windows. Run this command from PowerShell:

    setx DISPLAY "127.0.0.1:0.0"

  2. Close PowerShell and start VS Code again.

  3. Open the Remote Explorer (it’s the computery icon on the left side of the window)

  4. In the remote explorer, click the little + sign next to SSH to add a new SSH remote host. In the box that pops up, type

    ssh m2XXXXX@ssh.cs.usna.edu -XY

  5. Now you should be able to connect to ssh.cs.usna.edu from the remote explorer. When you are connected, a whole new VS Code window will come up and it will say “SSH:ssh.cs.usna.edu” in green on the bottom-left.

  6. After you are ssh’d to ssh.cs.usna.edu, you have to install the VS Code extensions for Python and Jupyter again.

  7. To go back to your local laptop’s WSL (not SSH), just click the green icon on the bottom-left and select WSL again.