Software Installs and Environment Setup
General Tips
Getting your software and programming tools working correctly can be tedious and annoying, but it’s an important part of the life of a data scientist.
Before you get started on any new install, make sure you have time to complete it. That means having plenty of battery (or an outlet available), a fast internet connection, and nowhere else to be for the next hour.
You will need all of the software packages below. Most of them you should already have from previous classes; come back here and re-install if your laptop gets wiped by ITSD or something stops working.
WSL/Ubuntu
- Laptop only (already on the lab machines)
- Should be fine from SD211 except the part below about creating your sd212 directory
Installation
This should be fine from SD211; it is the same as Step 1 in the SD211 setup instructions.
Open a PowerShell window as Windows administrator: hit WindowsKey+R to bring up the run dialog, type powershell, then hit Ctrl+Shift+Enter to run as administrator.
Run this command in PowerShell:
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
Restart Windows
Open Software Center and install “Ubuntu 20.04 LTS”
After the install completes, open Ubuntu for the first time. You will be prompted to choose a username and password.
You must choose your USNA username m2XXXXX as the username.
Use any simple password. It doesn’t need to be (and probably shouldn’t be) the same as your USNA password. It is only used to install software and other stuff inside Ubuntu on your laptop, so security isn’t a huge concern here. Keep it simple and memorable.
Within the Ubuntu terminal you just opened, run these commands to fix your bash settings:
sed -i.bak s/"@.h...033.00m.."// ~/.bashrc
printf "\n\nexport DISPLAY=:0\nexport REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt\n" >> ~/.bashrc
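If you want to confirm that the settings made it into your ~/.bashrc, a quick check along these lines may help. This is only a minimal sketch and not part of the official setup; the helper name check_bashrc is made up for illustration. It looks for the two export lines that the printf command appends.

```shell
# check_bashrc is a hypothetical helper, not part of the course setup.
# It reports whether each expected export line is present in a bashrc file.
check_bashrc() {
    local file="${1:-$HOME/.bashrc}"
    local missing=0
    local line
    for line in 'export DISPLAY=:0' \
                'export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt'; do
        # -F: fixed string, -x: match the whole line, -q: no output
        if grep -Fxq -- "$line" "$file" 2>/dev/null; then
            echo "found:   $line"
        else
            echo "missing: $line"
            missing=1
        fi
    done
    return "$missing"
}

check_bashrc "$HOME/.bashrc" || echo "Re-run the printf command, then open a new terminal."
```

The new settings only apply to terminals opened after ~/.bashrc was changed, so open a fresh Ubuntu terminal before relying on them.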
Software update
The software inside WSL/Ubuntu is not updated automatically by ITSD/Windows like everything else on your laptop. You should run this update periodically; at the very least, do it at the start of the semester.
First, open an Ubuntu terminal from the start menu. Then run these two commands, in order. When asked, enter your simple Ubuntu password.
sudo apt update
sudo apt full-upgrade
USNA SSL certificates
(Should already be done from SD211.)
Run this command from an Ubuntu terminal so that it plays nice on the USNA network. If prompted, enter your simple Ubuntu password.
curl http://apt.cs.usna.edu/ssl/install-ssl-system.sh | bash
SD212 Directory
Run these commands from an Ubuntu terminal so that you get a directory called sd212 which is visible on your desktop as well as within Ubuntu.
winhome=$(wslpath "$(wslvar USERPROFILE)")
mkdir -p "$winhome/Desktop/sd212"
ln -sfn "$winhome/Desktop/sd212" ~/sd212
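To double-check the result, you could verify that ~/sd212 really is a symbolic link pointing at an existing directory. The sketch below is one way to do that; check_link is a hypothetical helper name, not something provided by Ubuntu or the course. It would also flag the case where ~/sd212 already existed as a regular directory, since ln then places the link inside that directory instead of replacing it.

```shell
# check_link is a hypothetical helper, not part of the course setup.
# It succeeds only if its argument is a symbolic link that resolves to a directory.
check_link() {
    local link="${1:-$HOME/sd212}"
    if [ -L "$link" ] && [ -d "$link" ]; then
        echo "ok: $link -> $(readlink "$link")"
    else
        echo "problem: $link is not a symlink to an existing directory"
        return 1
    fi
}

check_link "$HOME/sd212" || echo "Re-run the commands above from an Ubuntu terminal."
```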
SSH keys
(Should already be done from SD211.)
Setting up SSH keys makes it so that you can easily access the CS department server and lab machines through SSH without having to type your password every time.
Run these commands from an Ubuntu terminal window. On the third step, you may be prompted for a password. That should be your USNA password, not your Ubuntu password.
mkdir -p ~/.ssh
[[ -e ~/.ssh/id_ed25519 ]] || ssh-keygen -t ed25519 -N ''
ssh-copy-id "$USER"@midn.cs.usna.edu
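If you are not sure whether the key setup worked, something like the following may help. It is a sketch with made-up helper names (check_ssh_key and can_login_with_key), not an official part of the setup. The first helper only looks at files on your laptop; the second actually contacts the server, so it needs a network connection and is defined here but not run.

```shell
# Hypothetical helpers for checking the SSH key setup; the names are illustrative.

# Succeeds if both halves of the key pair exist at the given path.
check_ssh_key() {
    local key="${1:-$HOME/.ssh/id_ed25519}"
    if [ -f "$key" ] && [ -f "$key.pub" ]; then
        echo "key pair found: $key"
    else
        echo "key pair not found: $key"
        return 1
    fi
}

# Succeeds only if the server accepts the key without asking for a password.
# BatchMode=yes makes ssh fail rather than prompt; ConnectTimeout limits the wait.
can_login_with_key() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

check_ssh_key "$HOME/.ssh/id_ed25519" || echo "Run the ssh-keygen command above first."
```

Once the key has been copied, running can_login_with_key "$USER"@midn.cs.usna.edu should succeed without asking for a password.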
Mamba (was Conda)
- You need to do this BOTH on your laptop and on a lab machine or midn.cs
In SD211 we used conda to install Python packages. That works, but it can get really slow at times. So, this semester we’ll use a newer tool called mamba.
Mamba actually does exactly the same things as Conda and uses all the same packages, but it runs way faster. Here’s how to install it.
Download mambaforge
Open a terminal and run the following to download the mambaforge installer from GitHub:
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
This will download a file of about 100MB called Mambaforge-Linux-x86_64.sh into your current directory.
Run the mambaforge installer
The installer you downloaded is actually a bash script. To run it, type:
bash Mambaforge-Linux-x86_64.sh
Follow the prompts when asked. When asked whether you want to initialize conda at the end, type yes.
Close the terminal and open a new terminal
(You have to start a new terminal session for bash to know about the installation you just did.)
Install a new sd212 environment with a bunch of packages
We will start with all the packages we used in SD211, but may add more later this semester.
Run this from the command line in your new terminal:
mamba create -n sd212 numpy pandas ipykernel matplotlib plotly seaborn scikit-learn opencv bs4 lxml nltk easygui wordcloud openpyxl
This will need to download a bunch of packages totaling around 400MB. Type Y when prompted and watch it go!
This might take a minute if you are on a slow connection, but should be much faster than it was with conda!
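Once the environment exists, you have to activate it before Python can see the packages. The sketch below assumes that conda activate sd212 works in a new terminal because you answered yes to the installer's initialization prompt. The helper check_imports is hypothetical (it is not part of mamba or conda); it simply tries to import each module and reports the ones that fail.

```shell
# check_imports is a hypothetical helper, not part of mamba or conda.
# Usage: check_imports <python-command> <module>...
# It tries to import each module and reports the ones that fail.
check_imports() {
    local py="$1"
    shift
    local mod failed=0
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "ok:      $mod"
        else
            echo "missing: $mod"
            failed=1
        fi
    done
    return "$failed"
}

# Activate the environment first (assumes the installer initialized conda):
#   conda activate sd212
# Note that scikit-learn imports as sklearn and opencv imports as cv2.
check_imports python numpy pandas matplotlib sklearn cv2 bs4 \
    || echo "Some packages are missing; check that the sd212 environment is active."
```

If every line says ok, the environment is ready. If everything is reported missing, the most likely cause is that the sd212 environment is not active in that terminal.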
(Optional) Remove anaconda
From the command line, run this command to wipe out your old conda stuff since we are using mamba now:
rm -rf ~/anaconda3
Xming
Xming is a small Windows utility that lets you display GUIs from WSL or ssh.
Install
- Visit https://sourceforge.net/projects/xming/
- Download Xming
- Run the file you just downloaded as administrator
Running/restarting
Xming should be running all the time on your laptop, for example before you open VS Code. If Xming is running, you will see its little X icon in the system tray at the bottom-right of the taskbar.
If not, then maybe Xming got closed or crashed for some reason. You should be able to find the Xming program in the start menu and just click it to start it up again.
VS Code
(This should already be done from SD211; see step 3 in the SD211 setup instructions.)
Installation
- Go to https://code.visualstudio.com/ and download for Windows.
- Run the installer after it downloads.
Setup WSL
- Open VS Code
- Open the “Extensions” pane (icon with 4 squares on the left side)
- Search for and install the “Remote Development” extension from Microsoft.
- After that install completes, click the green icon at the very bottom-left and select “New WSL Window”. It should now say “WSL: Ubuntu 20.04” at the bottom left.
- Now open “Extensions” again. Install the Python and Jupyter extensions from Microsoft.
Setup ssh connection to midn.cs
Close VS Code and open a Powershell in Windows.
Run this command from Powershell:
setx DISPLAY "127.0.0.1:0.0"
Close Powershell and start VS Code again.
Open the Remote Explorer (it’s the computery icon on the left side of the window)
In the remote explorer, click the little + sign next to SSH to add a new SSH remote host. In the box that pops up, type
ssh m2XXXXX@midn.cs.usna.edu -XY
Now you should be able to connect to midn.cs.usna.edu from the remote explorer.
When you are connected, a whole new VS Code window will come up and it will say “SSH:midn.cs.usna.edu” in green on the bottom-left.
After you are ssh’d to midn.cs, you have to install the VS Code extensions for Python and Jupyter again.
To go back to your local laptop’s WSL (not SSH), just click the green icon on the bottom-left and select WSL again.