Saturday, December 29, 2018

Data Science at the Command Line

At SURF I had the chance to attend a course called 'Data Science at the Command Line'. In a hands-on workshop, Jeroen Janssens from Data Science Workshops B.V. showed the attendees what you can do from the command line. The great thing: he recorded all the commands, so you can play them back if you want to :-). The file with the commands can be found here.

The accompanying slides, which also cover the Docker image you can download to run the commands in, can be found here. For your convenience, these are the commands to run from the command line:

  • docker pull datascienceworkshops/data-science-at-the-command-line 
  • docker run -it --rm -v "${PWD}":/data datascienceworkshops/data-science-at-the-command-line
You are introduced to the power of commands, used alone and in combination, such as awk, aws, body, cat, cowsay, csv2vw, csvgrep, csvjoin, csvlook, csvsql, curl, cut, drake, dseq, dumbplot, find, git, head, header, join, jq, less, nl, node, parallel, paste, pup, python, R, Rio, scrape, seq, shuf, sort, sql2csv, tapkee, tr, tree, uniq, wc, which, xmlstarlet and zcat.
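To give a feel for what combining commands means, here is a small sketch of my own (it is not taken from the workshop material). It chains a few of the standard tools listed above (tr, sort, uniq, head, awk) into a word counter. The function name top_words and the sample sentence are made up for illustration.

```shell
# Print the N most frequent words read from stdin (N defaults to 3).
# top_words is a hypothetical helper name, not part of the workshop files.
top_words() {
  tr '[:upper:]' '[:lower:]' |   # lower-case everything
    tr -cs '[:alpha:]' '\n' |    # put every word on its own line
    sort |                       # bring identical words together
    uniq -c |                    # count each distinct word
    sort -rn |                   # highest count first
    head -n "${1:-3}" |          # keep only the top N lines
    awk '{print $2, $1}'         # print "word count"
}

# Usage: the two most frequent words in a sample sentence.
printf 'the cat and the dog and the bird\n' | top_words 2
```

Each command does one small job, and the pipe hands its output to the next one. That is the idea the workshop builds on, with many more tools.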

Try it for yourself! It's really easy: open a command line, install Docker, pull the image, then replay the commands in the file and see what happens!

Thanks to Jeroen for letting me share this. And check out his company if you or your colleagues could benefit from the power of the command line!

Image source: the book Jeroen wrote, which you can find here.
