Science is increasingly accomplished via computational methods and data analyses. As such, computer scientists have worked to build infrastructure and techniques to enable such work. Provenance is one of the tools that helps users keep track of experiments and computational pipelines for increasingly complex analyses. Such information helps others trust results and reproduce them. In addition, keeping track of the different types of big data often requires new data management solutions like those for structured data.
Useful background: databases, data management
|9/4||Introduction||Review Syllabus & Web Page|
|9/9||Provenance||Provenance for Computational Tasks: A Survey||Project Ideas|
|9/11||Scientific Workflows||Scientific Workflow Management and the Kepler System||Reading Response|
|9/16||Scientific Workflow Provenance||A Framework for Collecting Provenance in Data-Centric Scientific Workflows||Reading Response|
|9/18||Workflow Evolution Provenance||Managing Rapidly-Evolving Scientific Workflows||Reading Response, Presentation Topics|
|9/23||System-Level Provenance||Provenance-Aware Storage Systems, ES3 Slides (Frew & Slaughter)||Reading Response|
|9/25||Database Provenance||Provenance in Databases: Past, Present, and Future, Provenance in Databases: Why, How, and Where, Integrating Workflow & Database Provenance Slides (Chirigati & Freire)||Reading Response (1st paper)|
|9/30||Provenance Storage||Efficient Provenance Storage, Kepler Provenance Model & Storage Slides (Anand)||Reading Response|
|10/2||Project Proposals||Project Proposal|
|10/7||Querying Provenance||Querying and Managing Provenance through User Views in Scientific Workflows, PQL Slides (Holland et al.)||Reading Response|
|10/9||Map-Reduce Provenance, Provenance Analytics||Provenance for Generalized Map and Reduce Workflows, Examining Statistics of Workflow Evolution||Reading Response (choose one)|
|10/14||Provenance Mining||Process Mining Based on Clustering: A Quest for Precision, Clustering Workflows Slides (Santos et al.)||Reading Response|
|10/16||Secure Provenance||The Case of the Fake Picasso: Preventing History Forgery with Secure Provenance, Provenance & Privacy Slides (Davidson et al.)||Reading Response|
|10/21||Provenance & Semantics||Janus: from Workflows to Semantic Provenance and Linked Open Data, Semantic Web & Linked Data Slides (Hassanzadeh)||Reading Response|
|10/23||Provenance Standards||PROV Model Primer, PROV Tutorial (Moreau et al.)||Reading Response|
|10/28||Visualization & Provenance||Supporting the Analytical Reasoning Process in Information Visualization||Reading Response|
|10/30||Visualization & Provenance||Generating Photo Manipulation Tutorials by Demonstration, Graphical Histories for Visualization: Supporting Analysis, Communication, and Evaluation, Nonlinear Revision Control for Images||Reading Response (choose one)|
|11/4||Reproducibility||Reproducible Research in Computational Science, Reproducible Epidemiologic Research||Reading Response (2nd paper)|
|11/6||Reproducibility||Making scientific computations reproducible, BURRITO: Wrapping Your Lab Notebook in Computational Infrastructure, ReproZip: Using Provenance to Support Computational Reproducibility||Reading Response (either 1st or both 2nd & 3rd)|
|11/11||No class (Veterans Day)|
|11/13||Project Progress Reports||Sample Report||Project Progress Report|
|11/18||Graph Databases||Survey of Graph Database Models (Read paper, scan appendix), A Comparison of a Graph Database and a Relational Database||Reading Response (1st paper)|
|11/20||Graph Indexing||Graph Indexing: A Frequent Structure-based Approach||Reading Response|
|11/25||Scientific Databases||The Architecture of SciDB||Reading Response|
|11/27||No class (Thanksgiving)|
Because a major part of this class will be discussing and exploring active research areas, you must attend class. Your attendance is factored into the class participation part of your grade. All assignments should be completed on time. In the event that you will miss a lecture or cannot complete a required assignment on time--due to a serious and unavoidable circumstance such as illness--you must notify the instructor as far in advance as possible.
This course will require a significant semester-long project where students are expected to design, implement, and test a new technique for managing, analyzing, or applying provenance. Sample project ideas include the following:
For your project proposal, you should include the following:
You will be required to present the project proposal and send me a progress report during the semester. With the proposal, I will be able to provide feedback on the feasibility of the project. The progress report is required to make sure that you are making progress on the project before the final deadline. A sample progress report is available. For the final deadline, you will need to provide the finished project code and a final report. In addition, you will be required to present your results.
Throughout the course, students are required to read the assigned readings and provide responses to them. We will review a sample report in class so students will understand the requirements. The reports should include a summary of the key ideas in the paper, a critical reaction to the ideas, and at least one question about the technique(s). The readings posted on the calendar are due on the day they are listed. Please see the Response Example for an example and the format. Students should submit their reading responses by noon (12 pm) on the day of class by sending a text or PDF document to the instructor's email address so the instructor may review the questions for class discussion.
Whenever readings are assigned, a short reading quiz may also be given at the beginning of class to check that students are prepared to discuss the day's readings.
In addition, each student will be required to present one of the assigned readings in a short (15-20 minute) presentation to prepare the class for discussion. The presentation should cover the motivation for the paper, the problem being solved, the solutions to that problem, and the results. In addition, the presentation should be critical, highlighting shortcomings in addition to the advertised benefits. The instructor will provide a list of topics and students will rank the topics they would most like to present. Then, the instructor will assign topics based on those rankings. Students may not receive their first choice if that topic is popular; in case of overlap, ties will be broken randomly.
Except for changes that substantially affect the evaluation (grading) of the course, this syllabus is a guide for the course and is subject to change. Please refer to the current class web page for the most current information.
The incomplete policy for this course is that at least 70% of the course must already be completed and an exceptional circumstance (e.g., a medical issue) must exist. If you feel you require an incomplete for an exceptional reason, you need to email me and state your reasons for the incomplete in writing. We will then decide on a course of action.
All UMass Dartmouth students are expected to maintain high standards of academic integrity and scholarly practice. The University does not tolerate academic dishonesty of any variety, whether as a result of failure to understand required academic and scholarly procedure or as an act of intentional dishonesty. A student found responsible for academic dishonesty is subject to severe disciplinary action, which may include dismissal from the University. Refer to the Student Handbook and Student Code of Conduct for due process.
Students must complete their own work. They should not copy work from another source (e.g. another student, a book or other published document, or a website). If you use sources that are not your own, you must explicitly acknowledge them. In this course, the instructor reserves the right to use the SafeAssign plagiarism detection software through myCourses.
In accordance with University policy, if you have a documented disability and require accommodations to obtain equal access in this course, please meet with the instructor at the beginning of the semester and provide the appropriate paperwork from the Center for Access and Success. The necessary paperwork is obtained when you bring proper documentation to the Center, which is located in Liberal Arts, Room 016; phone: 508.999.8711.
You may not record lectures without the instructor's permission. Please do not cause distractions that detract from your fellow students' learning. Cell phones and other electronic devices should be silent; if there is an emergency and you need to communicate with someone, please step out of the classroom. You may use electronic devices for note-taking, but please note that not participating in lectures (e.g. working on another assignment during lecture) will affect the class participation portion of your grade.