CSC 212 Final Project Guidelines

About the Project

The final project in CSC212 is worth 20% of your final grade (see Syllabus). The core work for this project will occur over two weeks, and lab time will be dedicated to supporting your projects. I will be evaluating lecture topics to support the projects that students propose. Note that if your recently did a final project in CSC111, you were given 5 weeks to work on it, and this project has far less time assigned (2-3 weeks).

Group Work

The final project may be completed individually or in groups of 2-3, subject to instructor approval.

TA Hours

If you want help during TA hours, print out or bring your proposal.

Please note (as discussed in class): TA’s may not be able to help much with coding and debugging. TA’s will be most helpful with explaining Data Structures concepts that have already been covered in class.

Deliverables

Final Project Proposal (Due by end of Lab 11/22).

If you are choosing to define your own project, you must turn in a project proposal document. If you are doing one of my suggestions, you will have to identify it, but less description will be needed.

Your final project proposal will identify you and your group members, if any. It will outline what you plan to accomplish and who will complete each portion of the assignment. A fair allocation of the work must be identified.

Final Project Checkpoint (Due by Midnight, 12/6)

Every member of your group will submit a 1-page “reflection” on how much progress the project has achieved in a single week. You will assign yourself a grade reflecting your work so far, and you will assign an overall grade for your group. Here an “A” means that you have done the work necessary in the first week to accomplish your goals for an A in the second week.

Final Project Submission (Due by Midnight, 12/12, the last day of classes).

Again, every member of your group will submit a 1-page “reflection”, providing individual and group grades (as with the checkpoint) that you believe that your project deserves. Final artifacts will also be uploaded to Moodle.

Grading Criteria

The final project is more open-ended, and therefore much more subjective than any of our other assignments. The following criteria are what I am looking for in a final project.

  • Does this project challenge me?
  • Does this project advance my knowledge of Java or Data Structures?
  • Has this project been successful? Have I learned from challenges that remain?

You will notice that correctness or success is not the primary emphasis. It is difficult to scope an open-ended two-week project.

Potential Project Ideas

My final project ideas are separated into two sections: research & presentation, and research & implementation. All final projects have a research and understanding component, but in one set of projects, coding is de-emphasized (there will still be some coding for all projects).

Research & Presentation Projects

If you pursue one of these projects, you will be invited to present on December 12 (the last day of class). Alternatively, it may be possible to write a paper, but this will require permission and discussion with me.

Bio/Chem/Genetics: Suffix Arrays

Suffix Arrays are a very neat data structure that allow for rapid evaluation of string search operations that have made an impact on DNA and RNA analysis. They are quite a mathematical/complicated data structure, so I have located this suggestion in the presentation realm rather than the implementation realm. Implementing some portion of the algorithms necessary is possible.

Animation/Graphics: BSP Trees

Binary-Space Partitioning Trees (BSP) as well as Octtrees play a large role in the generation and rendering of 3D scenes in graphics and animation. What do they do? How are they formed? Since writing a 3D engine is probably not possible in two weeks, this is also located in the “presentation” realm.

The Ethics of Data Structures & Algorithms

I recommend teaming up with an SDS major: most analyses of ethics, bias, fairness, or transparency involve statistical arguments.

Data Structures in the R Programming Language (SDS?)

For students familiar with R or another programming language (approval by me). There is an opportunity for a final project that examines how data structures are used in R. (What are vectors, how does one use hash maps in R?). The final output of this project would be both visual (a powerpoint presentation or a poster) in addition to a written document describing your findings. The target audience would be other students in the class that are unfamiliar with R, or your other language, so you would want to introduce the language, how you’ve learned it, any projects you have used it for, and then discuss the data structures available, how they’re used and what implementations you think they are.

Regular Expression Tutorial

Regular Expressions are a miniature programming language that encodes methods of matching strings. Your task will be to research regular expressions, find websites that teach you how to use them, find examples that are intuitive and design a visual (a powerpoint presentation or a poster) in addition to a written document describing your findings. The target audience would be other students in the class that are unfamiliar with regular expressions, so you would want to introduce them, demonstrate how they are used, digest the one I’ve provided in WordSplitter and help your fellow students learn something about this tool. You should show how they are used from both Python and Java.

Research & Implementation Projects

These projects all have a coding requirement.

SkipListSet or SkipListMap

Skip Lists are probabilistic, linked data structures that extend our SinglyLinkedList discussion to place nodes in an order with a randomly determined height. They function as a sort of random tree that does not need any balancing.

The original Communications of the ACM article by W. Pugh, 1990 includes quite reasonable pseudocode. It is quite possible to implement this and benchmark it in the SpellChecker environment (or another one) for a final project.

Bloom Filter & Count-Min Sketch (presentation also possible)

Sometimes, using a small amount of space and answering queries quickly is more important than accuracy. Bloom Filters are (in short) a HashSet<T> that does not necessarily resolve collisions, but uses multiple Hash functions to try to minimize the chance of a false-positive. It can only answer questions of the type “Is this item in the filter?” The Count-Min Sketch is a similar data structure that estimates the number of repeated items in a set; an approximate of HashMap<T, Integer>.

Data Structure Research and Implementation

You and your group will select a Data Structure that interests you that we have not covered in an assignment or in class. You will perform a literature review of this data structure: collecting as many sources as you can about it, and then you will prepare and implementation and a set of tests for this data structure. Your final report will include an overview of why and where this data structure is used, how it relates to ones we’ve covered in class, the complexity of the operations, and what you learned from your implementation.

Data Structures that are likely to turn out well include “Interval Trees”, and “Ropes”, but others are possible.

Performance Evaluation of Data Structures

Building off the ideas of the SpellChecker project, you and your group will select a task and a set of data structures (or algorithms) that can be used for that task. These data structures may be built by you and/or correctly sourced from Java itself, textbooks, or online resources. You will design some experiments to identify how these data structures and algorithms perform with synthetic and real-world datasets and create a report (including diagrams and charts) in order to make a recommendation for your target task.

The Graph Project (N. Howe)

One traditional final project in this course has been based on implementation of a Graph application, including breadth-first, depth-first searches. One benefit of this base for your final project is that your TAs have most likely done it themselves: http://www.cs.smith.edu/~nhowe/teaching/csc212/Assignments/final.php

Redesign an CSC212 Project

At the end of the course, you have now been exposed to a lot of Java programming and a variety of data structures. You know what has worked and what has not worked for your learning. You would identify an assignment in this course (e.g., P1: Fish Tank) and your goals (remove the graphical component) and design a new assignment that achieves the same learning goals (expose students who only know Python to Java syntax and make them design classes). The output of this project would be a Github repository with a README, rubric, and starter code, as well as a separate solution for most parts of your assignment – it would be OK to design some optional parts without solutions, but this has a tendency to annoy your TAs ;)

Propositional Logic Trees

If you’ve taken logic, or like to explore a little, consider making an expression tree setup that supports AND, OR, XOR, NOT, IMPLICATION, and more. Having a variable “Node” means that you could use it to generate truth tables for expressions.

Implement a 2D Video Game

Since many students have (historically) expressed interest in video games and their design, this is an opportunity to explore. A game based on our “SpookyMansion” codebase would be allowed provided your changes to it expand upon what was done already by any of your group members. This is the project that is most likely to take way more time than you expect, so I will be looking specifically at your proposal and midpoint evaluation.

The following are games where an implementation is achievable in 2 weeks.

  • Probably too easy: 2048
  • Definitely Visual: Breakout, Pacman, Space Invaders, Simple Tower Defense
  • If you like Trigonometry: Asteroids, Basic Tower Defense
  • Potentially Text-Based: Sudoku, Chess, Checkers, Card Games, Minesweeper, Pokemon-Like Battle Game, Snake Game, Frogger
  • Probably (borderline) too hard: Tetris, Bejeweled