Hackathon Attendees

Name | Picture | Project Description | Git Repo | Can assist with | May need assistance with | Work with others | Send Message
Steven

Build an enrichment tool analyser in Java.

Perl

Java

Yes
Felix

Develop a tool to determine the genetic background of a given sequencing library of mouse origin; for this, the tool will rely on the comprehensive mouse strain information collected by the Mouse Genomes Project. The aim is to identify at an early stage whether a sample really is derived from the mouse line or cross the researcher thinks it is… Tentative name: reStrainingOrder

plot.ly

No
Christel

There are different protocols to generate bisulfite sequencing libraries, and depending on how the library was made, different processing of the sequencing data is required. Charades is a tool that aims to determine the nature of a bisulfite sequencing library from the base composition in the corresponding fastq file. I started on Charades at the last hackathon but unfortunately, the project has been on ice for a bit - it's time to pick it up again!

No
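As a rough illustration of what "base composition in the corresponding fastq file" could mean for the Charades project above, here is a minimal sketch that tallies the fraction of each base at every read position. It is not the Charades implementation; the function name `base_composition`, the `max_reads` cap and the assumption of plain four-line FASTQ records are all hypothetical choices made for this example.

```python
from collections import Counter


def base_composition(fastq_lines, max_reads=100000):
    """Return per-position base fractions from an iterable of FASTQ lines.

    Assumes plain 4-line FASTQ records (header, sequence, '+', qualities).
    """
    counts = []  # counts[i] tallies the bases seen at read position i
    reads = 0
    for line_number, line in enumerate(fastq_lines):
        if line_number % 4 != 1:  # the sequence is the 2nd line of each record
            continue
        for pos, base in enumerate(line.strip().upper()):
            if pos == len(counts):
                counts.append(Counter())
            counts[pos][base] += 1
        reads += 1
        if reads >= max_reads:  # a subsample is enough for a composition profile
            break
    # Convert raw counts to fractions for the bases of interest
    return [
        {base: c[base] / sum(c.values()) for base in "ACGTN"}
        for c in counts
    ]


example = ["@r1", "TTGA", "+", "IIII", "@r2", "TTAA", "+", "IIII"]
profile = base_composition(example)
```

In practice the lines would come from an open file handle (for example `open("reads.fastq")`, a hypothetical path). A position-wise profile like this is one way to see depletion of a particular base across the read, which may differ between library types.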
Simon

I'm going to try adding vistories to SeqMonk. This will be a way to track and document the analysis done within the program, mixing technical details of the analysis with comments and annotation. The idea is that in the end you will be able to produce HTML reports which mix graphics and text and which fully document an analysis, with enough information to reproduce it.

Java

Perl

Linux

Anything NGS related.

Yes
Laura

I'll be working on a gene ontology tool from a previous hackathon, to add functionality for identifying potential biases and artefacts in the results. 

Yes
Michiel

Neural networks for topologically associated domain (TAD) boundary detection 

I will be continuing work on implementing a deep convolutional neural network for the detection of the genomic location of TAD boundaries. The implementation will be done using TensorFlow and the network will be trained on publicly available data. As a control, I have Hi-C data from cells under conditions where no TAD organization is present. This approach should allow for straightforward integration of replicate data. The problem is also parallelizable, so an implementation using GPGPU (or other massively parallel processing) can be done, although this is likely beyond the scope of this hackathon.

For a more detailed explanation about TADs and this project, please visit the repo

Yes
Russell

To make a metastable allele database with interactive web interface.

No
Xiaohui

I'm going to be working on developing a web interface for a manually curated public protein data set for mouse and human.

Yes
Marco

Protein secondary structure prediction using Deep Neural Networks

Computer Science skills

Bioinformatics algorithm

Classification and prediction with machine learning tools (such as Deep Neural Networks)

Biological skills

Yes
Chengwei

Not new to Bioinformatics but not an expert either.

Yes
Paula

Design a GUI using the Python web framework Django to make my variant calling pipeline Cross Filter (https://github.com/lmb-seq/cross_filter.git) more user-friendly.

Yes
Samuel

Deep learning with infant EEG data

Using recently collected EEG data from 97 eight-week-old infants, I'll be building a deep learning convolutional neural network (CNN) that will distinguish which type of stimulus they were listening to at the time of recording (a drum beat or a single syllable).

Linguistics, psychology, experimental design, R.

Python, mathematics.

Yes
Jonathan

Deep learning methods for matrix classification problems in high-throughput screening.

R, Statistics, Data analysis

Machine learning, python.

Yes
Paulo

Can machine learning be used to map fragments of cell free DNA to chromosomes in the human genome?

Bowtie2 is typically used to map DNA reads from next-generation sequencing (NGS) platforms to the human genome (hg19). Can a machine learning algorithm identify the patterns (if any) in each human chromosome and map a DNA read to a chromosome of the human genome with accuracy comparable to that of Bowtie2?

http://bowtie-bio.sourceforge.net/bowtie2/index.shtml

People would be more than welcome to join our team.

Yes
Ciro Santilli

I want to understand what your modelling project / prior-art do, run the hello world on my computer, and then help you create the perfect README that will explain to the world why your project is awesome, which in turn will make you more famous.

My technique is simple: the very beginning of the README must contain: how to build, how to run, this is the expected output, and that output is awesome because X (where X is usually "it predicts these experimental results"). Here is a sample of such a perfect README (non-bio): https://github.com/cirosantilli/linux-kernel-module-cheat

I am a software engineer in the semiconductor industry and know "nothing" about biology, which guarantees that if I am able to run the thing and see why it is awesome, then anyone who reads the README will too. Hopefully I will learn some bio in the process.

My main areas of curiosity in bioinf are, much like in software, the low level stuff: simulations of molecules and metabolic pathways for unicellular beings. My dream is to one day understand and model E. coli in ridiculous detail: https://github.com/cirosantilli/awesome-whole-cell-simulation and answer the good old "if I modify this gene, then this happens" without leaving the comfort of my living room. Will this happen in my lifetime?

 

Python, C++, Linux.

Yes
Ahood Yes
Stevie

Happy to help with projects. 

Otherwise, I have a project that could use some help: 
Currently my neighboring lab processes images of infected fish by hand: they select the area where the fish is, so that only the bright pixels where the bugs are in the fish get counted.
I'd like to automate this using machine learning. However, a complementary approach would be to write an algorithm which will detect the outline of the fish. 

python, machine learning algorithms, biology

Image processing, programming, machine learning

Yes
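For the fish-imaging project described above, the manual step (select the fish area, then count bright pixels inside it) can be sketched in a few lines once a fish mask exists. This is only a sketch under stated assumptions: the image is a grayscale grid of intensities, the mask marks fish pixels with a truthy value, and the names `count_bright_pixels` and `threshold` are hypothetical. Producing the mask automatically (the outline detection or machine learning part) is the actual open problem and is not attempted here.

```python
def count_bright_pixels(image, mask, threshold):
    """Count pixels brighter than `threshold` that lie inside the fish mask.

    `image` and `mask` are equally sized lists of rows; a truthy mask value
    means "this pixel belongs to the fish".
    """
    if len(image) != len(mask):
        raise ValueError("image and mask must have the same number of rows")
    bright = 0
    for image_row, mask_row in zip(image, mask):
        if len(image_row) != len(mask_row):
            raise ValueError("image and mask rows must have the same length")
        for value, inside_fish in zip(image_row, mask_row):
            if inside_fish and value > threshold:
                bright += 1
    return bright


# Toy 3x3 example: the fish occupies the top-right 2x2 block.
image = [
    [0, 200, 250],
    [10, 180, 90],
    [255, 5, 220],
]
mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
n_bright = count_bright_pixels(image, mask, threshold=150)
```

Here the bright pixels in the bottom row are ignored because they fall outside the mask, which mirrors what the hand selection is meant to achieve.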
Riccardo

Currently attending the Computational Biology MPhil in Cambridge.

Background in medicine and neuroscience, can code in R and Python.

Happy to join existing projects and help out as I can 

Yes
Matt

Bioinformatics!

Yes
Daniel

The Allen BrainSpan dataset and the associated Allen Brain Atlases are incredibly rich resources for exploring the anatomy, electrophysiology and gene expression patterns of cells within the brain. In my project, I'm hoping to take advantage of the well-established BrainSpan dataset, containing transcriptomes of cells across developmental stages within the developing brain, to reproduce experimental findings in developmental neurobiology. More specifically, I'm hoping to create an interactive/reproducible Jupyter-style document which parses and analyses the BrainSpan dataset to recapitulate previously established transcription patterns in the developing brain, to act as a tutorial publication for learning about developmental neurobiology.

Python, C, Java. 

Yes
Dani

Graph representation of experimental designs for projects submitted to the Human Cell Atlas

Python, Java

Yes
Mallory

Graph representation of experimental designs for projects submitted to the Human Cell Atlas

Yes
Zina

Graph representation of experimental designs for projects submitted to the Human Cell Atlas.

Yes
Avish

Don't have any prior experience in bioinformatics but would be happy to assist on the programming side of things. Have a little bit of experience with machine learning - hoping to do a project on something that uses this. Also interested in biological simulations.

Java, Python, C

Biology, Computer Simulations, Statistics 

Yes
Vicky

A biologist by training, currently working on -omics datasets, particularly genomics and metabolomics. I don't have a particular project in mind, but I'd like to assist with other projects whilst learning new theory and polishing my coding skills!

Biological knowledge, R, Statistics, Data analysis

Python, programming skills, machine learning 

Yes
David

Can machine learning be used to map fragments of cell free DNA to chromosomes in the human genome?

Bowtie2 is typically used to map DNA reads from next-generation sequencing (NGS) platforms to the human genome (hg19). Can a machine learning algorithm identify the patterns (if any) in each human chromosome and map a DNA read to a chromosome of the human genome with accuracy comparable to that of Bowtie2?

http://bowtie-bio.sourceforge.net/bowtie2/index.shtml

People would be more than welcome to join our team.

Machine Learning / Deep Learning

Python

MATLAB 

Signal Processing

Biology

Yes
Jakub

Our problem is "Can a machine learning algorithm beat Bowtie2 in human genome (hg19) sequence alignment?"

In simplest terms, the goal will be to design a Machine Learning model that will try to recognize which chromosome a gene sequence came from.

  • General programming (C++, Java, C#, Python)
  • Web development (Javascript, PHP, web servers)
  • SQL Databases
  • Docker containers
  • Shell scripts (Bash, Batch)
  • Machine learning (Tensorflow, Keras)
  • Happy to help with anything that requires a computer science background

DNA sequencing

Yes
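Any model that tries to recognize which chromosome a read came from first needs the read turned into numbers. The team's actual encoding is not stated here, so the following is only one common option, shown as an assumption: a fixed-length vector of k-mer counts. The function name `kmer_counts` and the default `k=3` are hypothetical.

```python
from itertools import product


def kmer_counts(read, k=3):
    """Return a fixed-length vector of k-mer counts for a DNA read.

    The vector has 4**k entries, ordered as itertools.product("ACGT", repeat=k).
    K-mers containing anything other than A, C, G or T (e.g. N) are skipped.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vector = [0] * len(kmers)
    read = read.upper()
    for start in range(len(read) - k + 1):
        position = index.get(read[start:start + k])
        if position is not None:
            vector[position] += 1
    return vector


features = kmer_counts("ACGTN", k=2)
```

Each read then becomes one row of a feature matrix, with the chromosome label (for example taken from a Bowtie2 alignment) as the training target. Whether such features carry enough signal to approach Bowtie2's accuracy is exactly the question the project asks.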
Peter

I founded non-profit ContentMine in 2014 to help mine the biomedical literature for all citizens on the planet. We work to develop a community and tools for readers.

Java

OpenAccess APIs

Text and Data Mining

Wikidata

 

I'd love to help other participants search the literature using machines - automatically. Very keen to meet anyone doing systematic reviews or analysing tables and diagrams automatically.
 

Yes
Shuo Yes
Chao Yes
Jo

Taking care of organisation, media, promotion and engagement, I am here to help!

I have a background in Neurobiology.

In between times I might be working on a project for schools using Raspberry Pi.

Your day to day housekeeping requirements.

Yes
Catrin

To make a metastable allele database with interactive web interface.

Yes
Carol

Help make an interactive database for Metastable Epialleles

Biology

Yes
Noah Yes
William

DNA can be used as a programmable material to create structures which fold into a chosen conformation via Watson-Crick base pairing interactions. The structure can be tuned by choosing base pair identity (A, T, G, C), similar to protein tertiary structure being contingent on primary sequence of amino acids. Since custom DNA sequences can be accurately and cheaply synthesized, scientists can create low cost macromolecular structures with exceptional structural customizability.

The ability to create custom macromolecular structures whose configuration is tunable down to the nanometer scale is powerful. Uses for DNA nanotechnology have been found in molecular computing, diagnostics, optics, and superresolution microscopy. However, adoption of this new technology is hampered by the text-based interfaces of the programs which optimize nucleotide sequence for formation of a desired structure.

We will create a web-based graphical user interface (GUI) for a DNA design program. The user will draw the equilibrium structure of a nucleic acid, and the program will use various publicly available optimization methods to generate an optimum DNA sequence given the desired structure.

Python, Numpy, Scipy, TensorFlow, Scikit Learn, MEAN stack, BASH scripting

High Performance Computing, Use of the CSD3 Cluster

Bioinformatics, Statistics, Machine Learning

Molecular Simulation, Molecular Dynamics (MD), Rare Event Sampling for MD, Monte Carlo, Umbrella Sampling

DNA Nanotechnology

Yes
Kirti No