====== Google Summer of Code 2018 Projects List ======

The following projects are available for GSOC 2018 students.

  * Project 1. Wildbook for Whale Sharks: Listen for YouTube Responses and Extract Wildlife Data (When/Where)
  * Project 2. Flukebook: Multi-species Configuration for Cetaceans with YAML
  * Project 3. Wildbook: Quickstart Wizard for Common Configuration Parameters
  * Project 4. Wildbook: Computer Vision Visualization (Help Us See What the Computer Sees!)
  * Project 5. Wildbook: Visualize Animal Co-Occurrence with D3.js
  * Project 6. Flukebook: Build an Intelligent Agent for a Flickr Community of Whale Enthusiasts
  * Project 7. Wildbook: i18n for a Multilingual Userbase
  * Project 8. Wildbook: Improved Data Navigation ("Breadcrumbs", Object Hierarchy)

Related repo: [[https://github.com/WildbookOrg/Wildbook]]

Mentor bios: [[http://www.wildme.org/contact/]]. All mentors are full-time Wild Me staff.

-  * While Wild Me is investing heavily in computer vision initiatives,​ such as [[http://​www.IBEIS.org|IBEIS]],​ there are still a lot of species for which researchers use the human brain to manually identify individual animals in photographs. For these researchers,​ we need an in-browser, user-friendly photo matching tool that allows them to filter their images down to a subset of comparable photos and examine them side-by-side for potential matches. We have experience from several approaches to this to help you develop a new workflow for better photo matching for wildlife. +
-  * Help us bring wildlife into your social network! Imagine if our online social networks spanned species, allowing us to follow wildlife under study just as we follow our friends. Wild me is pioneering this approach. [[https://​www.youtube.com/​watch?​v=Q_v2mXnbQP0|Check out our YouTube video]], and then [[http://fb.wildme.org/​wildme/public/​|check out the Facebook app]]. We need your help to further develop the social media spin of wildlife data by: +
-    * Helping us test and harden the Wild Me social media platform and prepare it for broader use. +
-    * Helping us build a Google+ integration. +
-    * Helping us build a Twitter integration. +
-  * Storytelling is becoming a popular way of telling data-driven stories. We would like the public - in addition to contributing data to science - to be able to tell the story of their wildlife experience. We want to combine relevant stories and data to tell the story of each animal'​s life that is under study. Tools like [[http://​storymap.knightlab.com/​|StorymapJS]] provide a basis to create very engaging, story-based content, allowing us to present the lives of amazing animals in easy-to-understand detailCan you help us tell each animal'​s story by mixing data and multimedia in a story telling framework in Wildbook?+
  
===Project 1. Wildbook for Whale Sharks: Listen for YouTube Responses and Extract Wildlife Data (When/Where)===

//**Summary**//: Use Java programming to listen for reply comments on whale shark-related YouTube videos, and run those comments through Google Cloud Vision, Google Machine Translation, and Natural Language Processing to help qualify whale shark sightings with relevant information about when and where the animal was sighted. This is a great opportunity to get exposure to multiple forms of A.I. (computer vision, machine translation, OCR, NLP, etc.) while developing unique software that aids wildlife researchers in the field.

//**Difficulty**:// Moderate (High concept, intricate work but low volume)
  
**Description**: In June 2017, we deployed a novel use of our existing computer vision and basic artificial intelligence tools (see wildbook.org) to pro-actively data mine YouTube for whale shark sightings and replace human labor with automation and an A.I. that interacts with YouTube posters. The software performs the following tasks:
  - Every 24 hours, download YouTube videos tagged or titled “whale shark” (English or Spanish).
  - Use computer vision to extract keyframes and detect and cluster those in which a whale shark has been identified. A trained neural network is used to make this decision.
  - Send video keyframes to Google’s Cloud Vision API to extract text from the frame (e.g., embedded dates) through optical character recognition (OCR).
  - Take the OCR text output, video title, and video description and send them as a single string to Google Translate for language detection (neural machine translation). If a non-English language is detected, translate to English.
  - Use Natural Language Processing (NLP) - another form of A.I. - to detect the date of the whale shark sighting (e.g., “yesterday”, “last week”, “11/13/2014”, etc.).
  - Use string matching to try to determine where the sighting occurred, based on our existing location categorizations in whaleshark.org.
  - Organize the relevant keyframes and metadata (where and when) and submit them together to whaleshark.org.
  - Automatically post questions (where? when?) to posters if metadata is missing.
  - Post scientific decisions (“This is whale shark A-100 and here’s a link to everything we know about it!”) back to posters on YouTube.
  
//**Expected outcomes**:// We need your help with the step above: "Automatically post questions (where? when?) to posters if metadata is missing." Currently, we have no automated way of listening for those comments and must manually review them for date and location information. We need you to create a listening service that monitors identified videos for comment replies and automatically feeds those replies into our natural language processing pipeline, allowing the needed information to flow into the whale shark research community and thanking posters for following up on our questions!
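
The listening service will likely be built on the YouTube Data API. As a rough, hedged illustration (not existing Wildbook code), the Java sketch below polls the v3 commentThreads endpoint for replies on an already-identified video; the API key, the video ID, and the downstream pipeline step are placeholders.

<code java>
// Rough sketch (Java 11+, not production Wildbook code): poll the YouTube Data
// API v3 for comment threads on a video already identified by the whale shark
// miner, then hand the raw JSON to the (hypothetical) translation/NLP step.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CommentListener {

    private static final String API_KEY = "YOUR_YOUTUBE_DATA_API_KEY"; // placeholder
    private static final HttpClient HTTP = HttpClient.newHttpClient();

    public static void main(String[] args) throws Exception {
        String videoId = "VIDEO_ID_FROM_MINER"; // placeholder: supplied by the existing pipeline
        while (true) {
            String json = fetchCommentThreads(videoId);
            // Hypothetical downstream step: translate the comment text, run the
            // NLP date/location extraction, and submit results to whaleshark.org.
            System.out.println(json);
            Thread.sleep(15 * 60 * 1000L); // poll every 15 minutes
        }
    }

    static String fetchCommentThreads(String videoId) throws Exception {
        URI uri = URI.create("https://www.googleapis.com/youtube/v3/commentThreads"
                + "?part=snippet,replies&videoId=" + videoId + "&key=" + API_KEY);
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return HTTP.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
</code>

A production service would also need to track which comments it has already processed so that posters are never asked or thanked twice.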
  
**Figure 1. Individual whale sharks can be identified from tourist videos**
  
Skills required/preferred: Java programming, Google Machine Translation, and YouTube API experience.

//**Possible mentors**:// Jason Holmberg (founder) and/or Mark Fisher, PhD. We will orient you to the existing APIs and help you build your listening service.

{{:youtubeexample.png?360|}}

===Project 2. Flukebook: Multi-species Configuration for Cetaceans with YAML===

//**Summary**//: Some of our Wildbooks are used to study multiple species as they co-occur in nature. For each species, researchers may record different types of measurements and different genetic values. Help us better support researchers by converting Wildbook from key-value pair configuration to a hierarchical YAML format that allows for configuration by species.

//**Difficulty**:// Easy

**Description**: The Wildbook project grew out of a single-species research effort to apply computer vision to whale sharks. As other researchers asked to use the software, we open-sourced the code and allowed its behavior to change according to simple key-value pairs in .properties files. However, the Flukebook.org project is now leading the way in studying multiple species simultaneously, and not every chosen configuration applies to all species. Patterning codes, types of body measurements, microsatellite DNA markers, types of photo keywords, and other values differ from species to species. We need your help to rewrite our properties loader and related classes to support hierarchical, species-specific configuration in YAML, allowing Wildbook to flexibly define its displayed data and functions based on the species under study. HTML, CSS, and JavaScript work is also needed to allow web pages to adapt to the loaded configuration dynamically.

//**Expected outcomes**//: Use Java to rewrite our configuration loader to support YAML. Collect requirements for a multi-species cetacean project (Flukebook.org) and create the required YAML configurations to reflect the real-world data collection of wildlife biologists studying humpback whales and sperm whales in the field. Use JavaScript and CSS to make pages responsive to species-specific configurations.
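
One possible shape for the new loader, as a hedged sketch assuming the SnakeYAML library (the species keys and property names below are examples, not our final schema):

<code java>
// Illustrative sketch of a hierarchical, per-species configuration loader,
// assuming SnakeYAML; the keys below are examples, not Wildbook's final schema.
import org.yaml.snakeyaml.Yaml;
import java.util.Map;

public class SpeciesConfig {

    private final Map<String, Object> root;

    @SuppressWarnings("unchecked")
    public SpeciesConfig(String yamlText) {
        this.root = (Map<String, Object>) new Yaml().load(yamlText);
    }

    // Look up one configuration value for one species.
    @SuppressWarnings("unchecked")
    public Object get(String species, String key) {
        Map<String, Object> speciesMap = (Map<String, Object>) root.get(species);
        return speciesMap == null ? null : speciesMap.get(key);
    }

    public static void main(String[] args) {
        String yaml =
              "Megaptera novaeangliae:\n"
            + "  measurements: [flukeWidth, totalLength]\n"
            + "  geneticMarkers: [EV1, GATA417]\n"
            + "Physeter macrocephalus:\n"
            + "  measurements: [totalLength]\n";
        SpeciesConfig cfg = new SpeciesConfig(yaml);
        System.out.println(cfg.get("Megaptera novaeangliae", "measurements"));
    }
}
</code>

A real loader would read the YAML from Wildbook's configuration resources and fall back to sensible defaults when a species omits a key.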

//**Possible mentors**:// Jason Holmberg and/or Drew Blount (Flukebook.org lead developer).

===Project 3. Wildbook: Quickstart Wizard for Common Configuration Parameters===

//**Summary**//: Use your JavaScript, CSS, HTML, and Java skills to build a quickstart wizard for Wildbook, helping biologists quickly configure it for their species upon first startup.

//**Difficulty**:// Easy. Best done with Project 2 above (YAML configuration).

**Description**: The behavior and configuration options for Wildbook are currently defined in properties files, which are generally neither understood by nor well described for the biologists seeking to use Wildbook. Use your programming skills to develop a slick, question-based interface that describes the needed configuration choices to biologists starting Wildbook for the first time, guides them through making good choices, and then saves their results to the configuration resources for persistence.

//**Expected outcomes**//: When Wildbook first starts, your configuration wizard quickly and easily helps biologists configure Wildbook for their wildlife species and research techniques of interest.
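
For the persistence step, a minimal, purely illustrative Java sketch (the property keys and file path are placeholders, not Wildbook's actual configuration resources) could look like the following; a YAML-backed version would pair naturally with Project 2:

<code java>
// Illustrative only: persist answers gathered by a first-run wizard to a
// .properties resource. The keys and file path are placeholders.
import java.io.FileWriter;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class WizardPersistence {

    public static void save(Map<String, String> answers, String path) throws IOException {
        Properties props = new Properties();
        props.putAll(answers);
        try (FileWriter out = new FileWriter(path)) {
            props.store(out, "Generated by the Wildbook quickstart wizard");
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> answers = new LinkedHashMap<>();
        answers.put("genusSpecies", "Rhincodon typus");       // example answer
        answers.put("identificationMethod", "spot pattern");  // example answer
        save(answers, "commonConfiguration.properties");      // placeholder path
    }
}
</code>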

//**Possible mentors**:// Jason Holmberg, Colin Kingen

===Project 4. Wildbook: Computer Vision Visualization (Help Us See What the Computer Sees!)===

//**Summary**//: The Wildbook Image Analysis pipeline can detect instances of animal species in images and even identify the individual animals from each detection. Use your RESTful JavaScript skills to render detected objects and their weights in the browser for user review.

//**Difficulty**:// Moderate (High concept, moderate coding complexity)

**Description**: Wildbook uses the open source library [[https://github.com/dimsemenov/photoswipe|PhotoSwipe]] to display images to users reviewing sighting records, search results, individual life histories, etc. Many of these images have been run through computer vision, and objects, object types, and weights have been determined from the image content. Each detection inside an image is called an "annotation". However, these annotations exist in the Image Analysis database but are not directly displayed to users so they can evaluate them and potentially provide feedback to the computer vision algorithms (e.g., "No, this is not a giraffe in this picture."). We want you to use your JavaScript skills to modify PhotoSwipe to make RESTful/JSON calls to our Image Analysis server and render annotation details to users in the browser.

{{:madagascar-annotated-860x574.png?600|}}

//**Expected outcomes**//: Your awesome JavaScript consults Image Analysis for displayed images in Wildbook, determines whether they have been run through computer vision (or are currently running), and displays detected "annotations" with bounding boxes, types, and weights.
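
For orientation only, the JSON the browser-side code would consume might look roughly like the output of this hedged Java sketch; the field names and the Gson dependency are assumptions, not the Image Analysis server's actual response format.

<code java>
// Illustrative sketch of an "annotation" payload a PhotoSwipe extension could
// render; field names and the Gson dependency are assumptions, not the real
// Image Analysis response format.
import com.google.gson.Gson;
import java.util.List;

public class AnnotationPayload {

    static class Annotation {
        String species;     // detected object type, e.g. "giraffe"
        double score;       // detection weight/confidence
        int x, y, w, h;     // bounding box in image pixel coordinates

        Annotation(String species, double score, int x, int y, int w, int h) {
            this.species = species; this.score = score;
            this.x = x; this.y = y; this.w = w; this.h = h;
        }
    }

    public static void main(String[] args) {
        List<Annotation> annotations = List.of(
            new Annotation("giraffe", 0.93, 120, 45, 310, 520),
            new Annotation("zebra", 0.71, 610, 200, 180, 240));
        // The browser-side JavaScript would draw one labeled box per annotation.
        System.out.println(new Gson().toJson(annotations));
    }
}
</code>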

//**Possible mentors**:// Jon Van Oast

===Project 5. Wildbook: Visualize Animal Co-Occurrence with D3.js===

//**Summary**//: Wrestle D3.js to the ground and make it render animal co-occurrence diagrams for search results in Wildbook.

//**Difficulty**:// Hard (but awesome!)

**Description**: D3.js has a bit of a learning curve, but once you know it, it is an amazing visualization tool for complex relationships. We want you to help wildlife biologists visualize the co-occurrence and social relationships of their study animals. A picture speaks a thousand words and might lead to new insights into how animals migrate and interact. Create a [[http://bl.ocks.org/MoritzStefaner/1377729|force-directed D3.js layout]] (or choose a better layout!), and use your strong JavaScript and RESTful/JSON skills to render relationships between individual animals in the browser. Expect biologists' minds to be blown!

{{::forcedlabel.png?600|}}

//**Expected outcomes**//: When an Individual Search is executed in Wildbook, a D3.js force-directed graph of co-occurrence relationships is available as a results option in Wildbook.
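
A D3.js force-directed layout conventionally consumes a nodes/links JSON structure. As a hedged sketch (all identifiers, field names, and the Gson dependency are illustrative), the Wildbook side could assemble that structure from pairwise co-occurrence counts along these lines:

<code java>
// Illustrative sketch: build the nodes/links JSON a D3.js force-directed
// layout expects from pairwise co-occurrence counts. Names are placeholders.
import com.google.gson.Gson;
import java.util.List;
import java.util.Map;

public class CoOccurrenceGraph {

    public static void main(String[] args) {
        // Each node is an individual animal; each link is a co-occurrence with
        // a weight equal to the number of shared sightings.
        List<Map<String, Object>> nodes = List.of(
            Map.of("id", "A-100", "sightings", 14),
            Map.of("id", "A-205", "sightings", 9),
            Map.of("id", "B-012", "sightings", 3));
        List<Map<String, Object>> links = List.of(
            Map.of("source", "A-100", "target", "A-205", "value", 6),
            Map.of("source", "A-205", "target", "B-012", "value", 1));
        System.out.println(new Gson().toJson(Map.of("nodes", nodes, "links", links)));
    }
}
</code>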

//**Possible mentors**:// Jason Holmberg

===Project 6. Flukebook: Build an Intelligent Agent for a Flickr Community of Whale Enthusiasts===

//**Summary**//: Apply your Java skills, learn the Flickr API, and blend multiple forms of A.I. to create an intelligent agent that automatically helps [[https://www.flickr.com/photos/flukematcher|a community of whale watching naturalists on Flickr]] rapidly identify the humpback whales they photograph and share online.

//**Difficulty**:// Hard (but seriously, A.I. experience is a great thing for a resume). The project integrates computer vision calls (existing APIs), NLP, neural machine translation, OCR, etc. Google Cloud Vision and Machine Translation APIs are used for processing visual and textual data. We have built similar agents for YouTube and Twitter, so we know we can help you succeed.

**Description**: The [[https://www.flickr.com/photos/flukematcher|Flukematcher Flickr community]] is a group of humpback whale watching enthusiasts manually comparing photos of humpback flukes, which are individually identifiable by their white and black contrast and by the unique, soundwave-like signature of their trailing edges. We already have the computer vision technology to match these flukes in images. We need you to apply your programming skills to build an intelligent agent that listens for new photo posts to the Flukematcher community and runs those posts automatically through our computer vision and NLP pipelines, quickly responding back to this community with the answer to "Which whale does this fluke belong to?" Social media APIs can be squirrely, so you'll need to build original code that connects Flickr to Wildbook, but the result will be a truly amazing and interactive blend of humans and A.I. studying whales!

{{::flukematcher.jpg?600|}}

//**Expected outcomes**//: As a result of your amazing coding skills, the [[https://www.flickr.com/photos/flukematcher|Flukematcher Flickr community]] has an intelligent agent suggesting the identities of the humpback whales sighted by a community of whale watching naturalists and guides.
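
As a rough, hedged starting point (not a finished agent), the listener could poll Flickr's public REST API for new photos in the community's photostream; the API key, user ID, and the downstream matching step below are placeholders.

<code java>
// Rough sketch (Java 11+): poll Flickr's public REST API for recent photos
// from the Flukematcher photostream. The API key and user ID are placeholders,
// and the computer-vision matching step is only indicated by a comment.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FlickrListener {

    public static void main(String[] args) throws Exception {
        String apiKey = "YOUR_FLICKR_API_KEY";    // placeholder
        String userId = "FLUKEMATCHER_USER_NSID"; // placeholder
        URI uri = URI.create("https://api.flickr.com/services/rest/"
                + "?method=flickr.people.getPhotos&api_key=" + apiKey
                + "&user_id=" + userId + "&per_page=25&format=json&nojsoncallback=1");
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        String json = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString()).body();
        // Next steps (not shown): download each new photo, run fluke detection and
        // matching through Wildbook's Image Analysis, and post the identity back.
        System.out.println(json);
    }
}
</code>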

//**Possible mentors**:// Jon Van Oast and Jason Holmberg, who have succeeded in deploying a similar agent for YouTube videos.

===Project 7. Wildbook: i18n for a Multilingual Userbase===

//**Summary**//: Review i18n static code analysis results and help us find and fix embedded strings, locale-sensitive methods, and other i18n issues to improve the usability of Wildbook in multiple languages.

//**Difficulty**:// Easy (or difficult if you want bonus points for helping us support right-to-left languages, such as Arabic)

**Description**: Wildbook is used by wildlife biologists across the globe in a variety of languages. While we have externalized strings and worked hard to ensure UTF-8 text is used universally, we haven't caught all of the potential bugs that can appear in non-English usage of Wildbook. Use your Java and JavaScript programming to help us extend our existing Spanish, French, and Finnish support to other languages by reviewing i18n static code analysis results, removing embedded strings, changing to UTF-8-compliant methods, and reviewing localized UI interfaces for a good experience in non-English left-to-right languages.
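
For reference, the typical fix for an embedded string is a ResourceBundle lookup, roughly as sketched below; the bundle name and key are illustrative, not Wildbook's actual resource layout.

<code java>
// Illustrative only: replace a hard-coded English string with a ResourceBundle
// lookup. The bundle name and key are placeholders, not Wildbook's real layout.
import java.util.Locale;
import java.util.ResourceBundle;

public class EmbeddedStringFix {

    public static void main(String[] args) {
        // Before: String title = "Search Results";   // embedded English string
        // After: look it up in a locale-specific bundle, e.g.
        //   i18n/ui_es.properties -> searchResults=Resultados de la búsqueda
        ResourceBundle bundle = ResourceBundle.getBundle("i18n.ui", new Locale("es"));
        String title = bundle.getString("searchResults");
        System.out.println(title);
    }
}
</code>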

//**Expected outcomes**//: Your awesome efforts have fixed i18n bugs in Wildbook, giving biologist users in non-English languages a great, localized experience with the software.

//**Possible mentors**:// Jason Holmberg

===Project 8. Wildbook: Improved Data Navigation and Visualization ("Breadcrumbs", Object Hierarchy)===

//**Summary**//: Use Java programming to retrieve objects, determine their relationships, and build an intuitive visual navigation system at the top of primary pages using JavaScript, CSS, and HTML.

//**Difficulty**:// Easy

**Description**: The hierarchy of objects in Wildbook is long and can be challenging to navigate and visualize in complex projects. We will implement a "breadcrumbs"-style UI at the top of main data pages to improve this. At its greatest depth, a user should be able to see the chain of relationships from an individual animal, to an "Encounter" with that animal at a certain time and place, to an "Occurrence" or group sighting of animals, all the way back to the survey where the data was recorded. At the top level, a user should be able to see from the Survey all of the associated data points (e.g., this survey has 14 Occurrences, 39 Encounters, and 11 identified individuals) and be given navigation options between them. This simple UI will reside at the top of high-usage pages and improve workflow for developers and researchers across multiple Wildbook implementations.
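
A hedged sketch of the chain the bar would render follows; the class names, IDs, and URL patterns are illustrative placeholders, and the real implementation would retrieve these objects through Wildbook's persistence layer.

<code java>
// Illustrative sketch: assemble the ordered "breadcrumb" chain for an Encounter,
// from its parent Survey and Occurrence down to the identified individual.
// Class names, IDs, and URL patterns are placeholders, not Wildbook internals.
import java.util.ArrayList;
import java.util.List;

public class Breadcrumbs {

    static class Crumb {
        final String type;
        final String label;
        final String url;
        Crumb(String type, String label, String url) {
            this.type = type; this.label = label; this.url = url;
        }
    }

    static List<Crumb> forEncounter(String surveyId, String occurrenceId,
                                    String encounterId, String individualId) {
        List<Crumb> chain = new ArrayList<>();
        chain.add(new Crumb("Survey", surveyId, "/surveys/" + surveyId));
        chain.add(new Crumb("Occurrence", occurrenceId, "/occurrences/" + occurrenceId));
        chain.add(new Crumb("Encounter", encounterId, "/encounters/" + encounterId));
        if (individualId != null) {
            chain.add(new Crumb("Individual", individualId, "/individuals/" + individualId));
        }
        return chain;
    }

    public static void main(String[] args) {
        for (Crumb c : forEncounter("SURV-2017-04", "OCC-123", "ENC-456", "A-100")) {
            System.out.println(c.type + " > " + c.label);
        }
    }
}
</code>

The JSP/JavaScript layer would then render this list as the clickable bar at the top of the page.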

//**Expected outcomes**://

  - Use Java to retrieve objects and their relationships from the database in an efficient way that does not significantly hurt load times.
  - Create a responsive visualization bar for these relationships using CSS, HTML, and Bootstrap.
  - Create lightweight navigation between these levels, handling the presence or absence of each level, using JavaScript and jQuery.
  - Possibly add new SQL/DataNucleus queries to improve load times.

Skills required/preferred: Java, CSS, HTML.
Nice to have: SQL (Postgres), DataNucleus.

//**Possible mentors**:// Colin Kingen, Wildbook Engineer.