COMPUTATIONAL GENOMICS


0623200010
DEPARTMENT OF INFORMATION AND ELECTRICAL ENGINEERING AND APPLIED MATHEMATICS
EQF7
INFORMATION ENGINEERING FOR DIGITAL MEDICINE
2024/2025



COMPULSORY
YEAR OF COURSE 1
YEAR OF DIDACTIC SYSTEM 2022
SPRING SEMESTER
MODULE                     CFU   HOURS   ACTIVITY
1 COMPUTATIONAL GENOMICS   1     8       LESSONS
1 COMPUTATIONAL GENOMICS   2     16      EXERCISES
2 COMPUTATIONAL GENOMICS   2     16      LESSONS
2 COMPUTATIONAL GENOMICS   1     8       LAB
Objectives
The course provides the basic knowledge for the analysis of data produced by Next Generation Sequencing platforms and the construction of pipelines for the analysis of such data.

Knowledge and understanding
Bioinformatics databases, structure, methods of access and consultation. Main problems for the analysis of omics data such as the alignment of sequences, the search for genes, mutations and variants, the correlation between genes, genome assembly. Characteristics of the main tools and platforms available on the market.

Applying knowledge and understanding
Access the main bioinformatics databases and use them in specific applications. Use the main tools and platforms for data analysis. Create pipelines for the analysis of different types of omics data such as DNA, RNA, mRNA, miRNA, referring to different species.
Prerequisites
Although not formally required, it is strongly recommended that students have already taken the courses [0622900007] Elements of Biology and [0622900008] Elements of Medical Genetics and Genomics.
Contents
DIDACTIC UNIT 1: INTRODUCTION TO BIOINFORMATICS AND COMPUTATIONAL GENOMICS PROBLEMS (LECTURE/PRACTICE/LABORATORY HOURS 2/0/0)

DIDACTIC UNIT 2: INTRO TO R PROGRAMMING LANGUAGE (LECTURE/PRACTICE/LABORATORY HOURS 2/2/0)
1 - (2 HOURS LECTURE) RSTUDIO ENVIRONMENT AND R DATA STRUCTURES
2 - (2 HOURS PRACTICE) BIOCONDUCTOR PACKAGE
KNOWLEDGE AND UNDERSTANDING: R LANGUAGE SYNTAX AND MAIN DATA STRUCTURES (LIST, VECTOR, ARRAY, MATRIX AND DATA FRAME)
APPLYING KNOWLEDGE AND UNDERSTANDING: USE OF THE BIOCONDUCTOR PACKAGE FOR THE MANAGEMENT AND ANALYSIS OF GENOME SEQUENCES (DNA/RNA)

DIDACTIC UNIT 3: BIOLOGICAL AND BIOINFORMATICS DATABASES AND RESOURCES (LECTURE/PRACTICE/LABORATORY HOURS 2/6/0)
3 - (2 HOURS PRACTICE) GENOME SEQUENCE DATABASES (ENSEMBL, GENBANK) AND PROTEIN SEQUENCE DATABASES
4 - (2 HOURS PRACTICE) BIOINFORMATICS RESOURCES AND GENE ONTOLOGY
5 - (2 HOURS PRACTICE) NEXT GENERATION SEQUENCING FILE FORMATS (FASTA, FASTQ, SAM/BAM, VCF)
6 - (2 HOURS PRACTICE) EXAMPLES OF USE OF ONLINE BIOINFORMATICS RESOURCES AND DATABASES
KNOWLEDGE AND UNDERSTANDING: CHARACTERISTICS OF THE DIFFERENT DBS/REPOSITORIES AND POLICIES FOR ACCESS AND USAGE OF ALGORITHMS AND DATA
APPLYING KNOWLEDGE AND UNDERSTANDING: USE OF RESOURCES (DBS AND TOOLS) FOR SOLVING SIMPLE BIOINFORMATICS PROBLEMS (GENES SEARCH, GENE SEQUENCES EXTRACTION, ETC.)
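
As an illustration of the file formats listed in item 5, the following is a minimal sketch of a FASTQ reader. It assumes the common four-line record layout (header starting with `@`, sequence, separator starting with `+`, quality string) and Phred+33 quality encoding; the function name `read_fastq` is hypothetical and not part of the course material.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, phred_scores) for each four-line FASTQ record.

    Assumption: records are not wrapped over multiple lines and the
    quality string uses Phred+33 encoding.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops the iteration
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Phred+33: quality score = ASCII code of the character minus 33
        yield header[1:], seq, [ord(c) - 33 for c in qual]


# Toy in-memory example with two reads
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGG\n+\n!#\n")
for read_id, seq, scores in read_fastq(example):
    print(read_id, seq, scores)
```

In practice, dedicated libraries (for example those distributed through Bioconductor, used in the course) handle compressed files and format variants that this sketch ignores.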


DIDACTIC UNIT 4: INDEXING TECHNIQUES AND READ ALIGNMENT (LECTURE/PRACTICE/LABORATORY HOURS 10/0/0)
7 - (2 HOURS LECTURE) MOTIF SEARCH
8 - (2 HOURS LECTURE) INDEXING TECHNIQUES FOR READ ALIGNMENT
9 - (2 HOURS LECTURE) BURROWS–WHEELER TRANSFORM AND FM INDEX
10 - (2 HOURS LECTURE) APPROXIMATE MATCHING AND DYNAMIC PROGRAMMING FOR EDIT DISTANCE
11 - (2 HOURS LECTURE) DYNAMIC PROGRAMMING FOR LOCAL AND GLOBAL ALIGNMENT
KNOWLEDGE AND UNDERSTANDING: PROPERTIES OF THE SUFFIX ARRAY AND FM INDEX DATA STRUCTURES; CHARACTERISTICS OF THE ALGORITHMS FOR MOTIF SEARCH AND FOR THE LOCAL AND GLOBAL STRING ALIGNMENT
APPLYING KNOWLEDGE AND UNDERSTANDING: EVALUATION AND USE OF THE CHARACTERISTICS OF THE DIFFERENT DNA INDEXING DATA STRUCTURES AND OF THE STRING ALIGNMENT ALGORITHMS
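
Two of the techniques named in this unit can be sketched in a few lines. The first function builds the Burrows-Wheeler transform by sorting all rotations of the text, which is a didactic construction only (real indexes derive it from a suffix array). The second computes the edit distance with unit costs by dynamic programming. The function names and the `$` sentinel are illustrative assumptions.

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (illustration only).

    Assumption: '$' does not occur in the text and sorts before every
    other character, which holds for uppercase or lowercase letters.
    """
    if "$" in text:
        raise ValueError("text must not contain the sentinel '$'")
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix
    return "".join(rotation[-1] for rotation in rotations)


def edit_distance(a: str, b: str) -> int:
    """Edit (Levenshtein) distance with unit costs, computed row by row."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1]


print(bwt("banana"))                       # annb$aa
print(edit_distance("kitten", "sitting"))  # 3
```

Global and local alignment (item 11) use the same table-filling idea, with a scoring scheme in place of unit costs.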

DIDACTIC UNIT 5: NEXT GENERATION SEQUENCING TECHNOLOGIES AND TOOLS (LECTURE/PRACTICE/LABORATORY HOURS 0/10/0)
12 - (2 HOURS PRACTICE) EXOME SEQUENCING
13 - (2 HOURS PRACTICE) TRANSCRIPTOMICS AND SINGLE CELL SEQUENCING
14 - (2 HOURS PRACTICE) METAGENOMICS
15 - (2 HOURS PRACTICE) FUNCTIONAL ANALYSIS - GENE ONTOLOGY ENRICHMENT ANALYSIS
16 - (2 HOURS PRACTICE) TOOLS, ENVIRONMENTS AND PIPELINES FOR NGS APPLICATIONS
KNOWLEDGE AND UNDERSTANDING: TECHNOLOGICAL AND FUNCTIONAL CHARACTERISTICS OF THE MAIN DNA/RNA SEQUENCING AND DATA ANALYSIS TECHNIQUES
APPLYING KNOWLEDGE AND UNDERSTANDING: USE OF SEQUENCING DATA AND TOOLS FOR THE DESIGN OF BIOINFORMATICS PIPELINES
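
Gene Ontology enrichment (item 15) is often introduced through a one-sided hypergeometric test for over-representation. The sketch below assumes that approach; the course may rely on other tools or statistics, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Probability of observing at least `hits` annotated genes in the sample.

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    sample:     size of the gene list under study
    hits:       genes in the list annotated with the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers, chosen only for illustration
print(enrichment_p_value(population=10, annotated=5, sample=5, hits=5))
```

When many GO terms are tested at once, the resulting p-values normally require a multiple-testing correction, which this sketch omits.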


DIDACTIC UNIT 6: GENOME ASSEMBLY ALGORITHMS AND GRAPHS (LECTURE/PRACTICE/LABORATORY HOURS 4/2/0)
17 - (2 HOURS LECTURE) GENOME ASSEMBLY AND OVERLAP GRAPH
18 - (2 HOURS LECTURE) GENOME ASSEMBLY AND DE BRUIJN GRAPH
19 - (2 HOURS PRACTICE) GENOME ASSEMBLY TOOLS USAGE
KNOWLEDGE AND UNDERSTANDING: GRAPH DATA STRUCTURES AND RELATED ALGORITHMS FOR SOLVING THE GENOME ASSEMBLY
APPLYING KNOWLEDGE AND UNDERSTANDING: EVALUATION OF THE FUNCTIONAL CHARACTERISTICS AND PERFORMANCE (CPU AND MEMORY) OF GENOME ASSEMBLY TOOLS AVAILABLE IN THE LITERATURE
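
The de Bruijn graph of item 18 can be illustrated with a short sketch: every k-mer found in the reads becomes an edge from its (k-1)-length prefix to its (k-1)-length suffix. The function name `de_bruijn` and the toy read are assumptions made for the example; real assemblers also deal with sequencing errors, reverse complements and repeats.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def de_bruijn(reads: Iterable[str], k: int) -> Dict[str, List[str]]:
    """Build a de Bruijn multigraph as an adjacency list.

    Nodes are (k-1)-mers; each k-mer occurrence in a read adds one edge
    from the k-mer's prefix to its suffix.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    graph: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# One toy read, k = 3: k-mers ACG, CGT, GTC
print(de_bruijn(["ACGTC"], 3))  # {'AC': ['CG'], 'CG': ['GT'], 'GT': ['TC']}
```

An assembly then corresponds to a walk that uses every edge (an Eulerian path), whereas the overlap graph of item 17 treats whole reads as nodes.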

DIDACTIC UNIT 7: PROJECT WORK (LECTURE/PRACTICE/LABORATORY HOURS 0/0/8)
DEVELOPING A PIPELINE FOR THE ANALYSIS OF DNA/RNA DATA

KNOWLEDGE AND UNDERSTANDING: ANALYSE THE PROBLEM IN TERMS OF TYPE OF DATA AND TOOLS TO USE FOR DESIGNING THE PIPELINE
APPLYING KNOWLEDGE AND UNDERSTANDING: DESIGN AND IMPLEMENT THE PIPELINE AND DISCUSS THE ACHIEVED RESULTS

TOTAL HOURS LECTURE/PRACTICE/LABORATORY 20/20/8

Teaching Methods
THE COURSE (48 HOURS OF LECTURES, PRACTICE SESSIONS AND LABORATORY ACTIVITIES) HAS A DYNAMIC SETTING THAT INCLUDES THE ANALYSIS OF CASE STUDIES WITH THE ACTIVE PARTICIPATION OF THE STUDENTS, WHO WILL EXPLORE IN DEPTH THE USE OF NGS TECHNOLOGIES AND GENOME ANALYSIS TOOLS AND FRAMEWORKS WHILE CARRYING OUT THE PROJECT WORK. IN PARTICULAR, THE TEACHING ACTIVITIES INCLUDE LECTURES (20H), PRACTICE SESSIONS (20H) AND LABORATORY SESSIONS (8H) DEVOTED TO THE DEVELOPMENT OF THE PROJECT WORK. IN THE PROJECT, STUDENTS APPLY THEIR KNOWLEDGE TO CHOOSE INDEPENDENTLY THE MOST APPROPRIATE TECHNOLOGIES, FRAMEWORKS AND TOOLS TO SOLVE SPECIFIC PROBLEMS IN THE SELECTED APPLICATION DOMAINS. THE EDUCATIONAL ACTIVITIES ARE SUPPORTED BY THE DIEM E-LEARNING PLATFORM (HTTP://ELEARNING.DIEM.UNISA.IT), WHICH IS USED TO FACILITATE AND STIMULATE DISCUSSION AMONG STUDENTS AND TO ANNOUNCE AND DISTRIBUTE TEACHING MATERIALS.
Verification of learning
THE FINAL EXAM IS DESIGNED TO ASSESS THE OVERALL KNOWLEDGE AND UNDERSTANDING OF THE CONCEPTS PRESENTED IN THE COURSE, THE ABILITY TO APPLY THAT KNOWLEDGE TO DEVELOP SPECIFIC APPLICATIONS AS WELL AS THE ABILITY TO COMMUNICATE AND PRESENT THE WORK CARRIED OUT (COMMUNICATION SKILLS).
THE EXAMINATION CONSISTS OF A PRACTICAL PART AND AN ORAL EXAM (INTERVIEW). THE PRACTICAL PART CONSISTS OF THE DEVELOPMENT OF A PROJECT WORK TO BE CARRIED OUT IN GROUPS (2-4 STUDENTS) ON THE PROPOSED TOPICS.
THE ORAL EXAM CONSISTS OF THE PRESENTATION OF WHAT HAS BEEN ACHIEVED DURING THE DEVELOPMENT OF THE PROJECT WORK.
EACH GROUP MEMBER PRESENTS THEIR OWN CONTRIBUTION TO THE PROJECT AND DISCUSSES THE TOOLS AND FRAMEWORKS USED TO IMPLEMENT THE PIPELINE AND THE RESULTS ACHIEVED. THE ORAL EXAM ALSO ASSESSES THE STUDENT'S KNOWLEDGE OF THE TOPICS PRESENTED DURING THE COURSE.
THE FINAL MARK IS EXPRESSED OUT OF 30: THE PRACTICAL PART ACCOUNTS FOR 65% AND THE ORAL EXAM FOR 35%. “HONOURS” (30/30 CUM LAUDE) ARE AWARDED TO STUDENTS WHO DEMONSTRATE FULL MASTERY OF ALL THE MAIN METHODOLOGICAL AND TECHNOLOGICAL ASPECTS ADDRESSED IN THE COURSE, OF HOW THEY CAN BE USED TO CREATE APPLICATIONS AND SOLUTIONS IN DIFFERENT APPLICATION DOMAINS, AND OF THE IMPLICATIONS OF THEIR USE.
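
The weighting stated above can be read as a simple weighted average. The sketch below assumes that both parts are marked out of 30 and that no rounding rule applies, since the syllabus does not specify either; the function name is hypothetical.

```python
def final_mark(practical: float, oral: float) -> float:
    """Weighted final mark out of 30: 65% practical part, 35% oral exam.

    Assumption: both inputs are marks out of 30; rounding is left to the examiners.
    """
    for mark in (practical, oral):
        if not 0 <= mark <= 30:
            raise ValueError("marks must lie between 0 and 30")
    return 0.65 * practical + 0.35 * oral


# Example: 28/30 in the project work and 24/30 in the oral exam
print(final_mark(28, 24))
```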
Texts
COURSE BOOKS
MANDOIU I., ZELIKOVSKY A., COMPUTATIONAL METHODS FOR NEXT GENERATION SEQUENCING DATA ANALYSIS (2016)
COMPEAU P., PEVZNER P., BIOINFORMATICS ALGORITHMS: AN ACTIVE LEARNING APPROACH, 3RD EDITION (2018)
SUGGESTED BOOKS AND LEARNING MATERIAL
HTTPS://EN.WIKIBOOKS.ORG/WIKI/NEXT_GENERATION_SEQUENCING_%28NGS%29
WANG X., NEXT-GENERATION SEQUENCING DATA ANALYSIS (2014)
LOW L., TAMMI M., BIOINFORMATICS: A PRACTICAL HANDBOOK OF NEXT GENERATION SEQUENCING AND ITS APPLICATIONS (2017)

SUPPLEMENTARY TEACHING MATERIAL WILL BE AVAILABLE ON THE UNIVERSITY E-LEARNING PLATFORM (HTTP://ELEARNING.UNISA.IT) ACCESSIBLE TO STUDENTS USING THEIR OWN UNIVERSITY CREDENTIALS.
More Information
The course is held in English.