International Teaching | INFORMATION SYSTEMS FOR BIG DATA
cod. 0222700009
INFORMATION SYSTEMS FOR BIG DATA
0222700009 | |
DEPARTMENT OF MANAGEMENT & INNOVATION SYSTEMS | |
EQF7 | |
DATA SCIENCE AND INNOVATION MANAGEMENT | |
2022/2023 |
YEAR OF COURSE 2 | |
YEAR OF DIDACTIC SYSTEM 2020 | |
AUTUMN SEMESTER |
SSD | CFU | HOURS | ACTIVITY | |
---|---|---|---|---|
INF/01 | 3 | 21 | LESSONS | |
INF/01 | 3 | 21 | LAB |
Objectives | |
---|---|
THE COURSE INTRODUCES THE FUNDAMENTAL CONCEPTS, REQUIREMENTS, TECHNOLOGIES, AND REFERENCE ARCHITECTURES FOR DEFINING AND IMPLEMENTING BIG DATA-ORIENTED INFORMATION SYSTEMS. SKILLS WILL BE ACQUIRED BY STUDYING EXISTING TECHNOLOGICAL FRAMEWORKS FOR ACQUISITION; FOR STORAGE THROUGH NOSQL DATABASES (SOLR, MONGODB, NEO4J, ETC.) AND BIG DATA FILE FORMATS (AVRO, PARQUET, ETC.); AND FOR DISTRIBUTED PROCESSING, IN BOTH BATCH AND STREAM MODE (HADOOP, SPARK, ETC.), WITH THE AIM OF COMPUTING ANALYTICS FROM UNSTRUCTURED OR SEMI-STRUCTURED RESOURCES IN A SCALABLE MANNER. AN INTRODUCTION WILL ALSO BE PROVIDED TO WEB APPLICATIONS FOR ANALYTICS VISUALIZATION, INCLUDING D3.JS AND TECHNOLOGY STACKS SUCH AS APACHE SOLR+BANANA AND ELASTICSEARCH+KIBANA. AT THE END OF THE COURSE, THE STUDENT WILL BE ABLE TO USE THE MAIN TECHNOLOGICAL TOOLS FOR ACQUIRING, STORING, PROCESSING, AND ANALYZING BIG DATA. FURTHERMORE, THE STUDENT WILL BE ENCOURAGED TO CARRY OUT GROUP WORK AND APPLY THE ACQUIRED KNOWLEDGE TO IMPLEMENT A PROJECT EXHIBITING BIG DATA ANALYTICS FUNCTIONALITIES IN A CHOSEN FIELD (E.G., SOCIAL MEDIA, WEB INTELLIGENCE, SMART ENVIRONMENTS). THE OBJECTIVE IS TO EXERCISE THE ABILITY TO SELECT AND ADOPT SUITABLE TECHNOLOGIES DEPENDING ON THE HETEROGENEOUS REQUIREMENTS OF THE PROJECT CONTEXT. |
Prerequisites | |
---|---|
IT IS DESIRABLE THAT STUDENTS KNOW: THE BASIC CONCEPTS OF ALGORITHMS AND DATA STRUCTURES; AT LEAST ONE PROGRAMMING LANGUAGE AMONG JAVA, PYTHON, AND SCALA, WELL ENOUGH TO WRITE SIMPLE PROGRAMS; THE BASICS OF DATABASES AND SQL. |
Contents | |
---|---|
AFTER A BRIEF INTRODUCTION TO THE MAIN LEARNING OBJECTIVES OF THE COURSE, STUDENTS WILL BE INTRODUCED TO THE BIG DATA WORLD. IN THE EARLY PART OF THE COURSE, STUDENTS WILL BE ENCOURAGED TO WORK IN TEAMS TO DEFINE A PROJECT IN WHICH TO APPLY, STEP BY STEP, THE KNOWLEDGE ACQUIRED DURING THE CLASSES. THE COURSE IS COMPOSED OF THE FOLLOWING MAIN PARTS. (1) INTRODUCTION TO BIG DATA-ENABLED ARCHITECTURES (4 HOURS): BIG DATA LANDSCAPE; REQUIREMENTS OF A BIG DATA INFORMATION SYSTEM; LAMBDA AND KAPPA ARCHITECTURES. (2) ACQUISITION (4 HOURS, ONE OF WHICH IS A LABORATORY ACTIVITY): SERIALIZATION AND DATA EXCHANGE FORMATS (JSON, AVRO, PARQUET, ETC.); REST AND STREAM APIS FOR ACCESSING TWITTER, DROPBOX, ETC. (3) DISTRIBUTED PROCESSING (10 HOURS, SEVEN OF WHICH ARE LABORATORY ACTIVITIES): HADOOP AND RELATED TECHNOLOGIES; SPARK AND OTHER BIG DATA PROCESSING ENGINES; HANDS-ON SPARK DATAFRAME; HANDS-ON SPARK MACHINE LEARNING. (4) STORAGE (10 HOURS, SEVEN OF WHICH ARE LABORATORY ACTIVITIES): INTRODUCTION TO NOSQL DATABASES, SUCH AS KEY-VALUE STORES, DOCUMENT-ORIENTED DATABASES, COLUMN-ORIENTED DATABASES, AND GRAPH DATABASES; HANDS-ON MONGODB; HANDS-ON NEO4J. (5) DISTRIBUTED STREAM PROCESSING (10 HOURS, FOUR OF WHICH ARE LABORATORY ACTIVITIES): INTRODUCTION TO DISTRIBUTED DATA STREAM PROCESSING; APACHE STORM, SPARK STREAMING, KAFKA STREAMS; HANDS-ON SPARK STREAMING; HANDS-ON KAFKA STREAMS. (6) BIG DATA ANALYTICS (4 HOURS, TWO OF WHICH ARE LABORATORY ACTIVITIES): INTRODUCTION TO ANALYTICS VISUALIZATION THROUGH A WEB APPLICATION, CONSIDERING D3.JS AND THE MOST WIDELY USED TECHNOLOGY STACKS (APACHE SOLR AND BANANA, ELASTICSEARCH AND KIBANA); HANDS-ON APACHE SOLR AND BANANA. |
Teaching Methods | |
---|---|
THE COURSE AIMS TO ENCOURAGE STUDENTS TOWARD LIFELONG LEARNING, THAT IS, THE CONTINUOUS UPDATING OF KNOWLEDGE AND SKILLS THROUGHOUT LIFE, BY STIMULATING CURIOSITY AND INTEREST IN INFORMATION TECHNOLOGY AND IN NEW TECHNOLOGIES PERTAINING TO THE SUBJECT MATTER OF THE COURSE. IN ORDER TO GET STUDENTS USED TO SELF-LEARNING, THEY WILL BE INVITED TO EXPLORE THE TOPICS OF THE COURSE IN GREATER DEPTH AND WILL BE GIVEN ACCESS TO ONLINE RESOURCES OF PARTICULAR INTEREST. DURING THE COURSE, THE TEACHER WILL MAKE AMPLE USE OF EXAMPLES AND GUIDED EXERCISES. FROM A STRUCTURAL POINT OF VIEW, THE COURSE CONSISTS OF 21 HOURS OF LECTURES AND 21 HOURS OF LABORATORY ACTIVITIES. |
Verification of learning | |
---|---|
THE ACHIEVEMENT OF THE TEACHING OBJECTIVES IS CERTIFIED BY PASSING AN EXAM GRADED ON A SCALE OF THIRTY. THE EXAM IS DIVIDED INTO TWO PARTS: A "THEORETICAL" ASSESSMENT AND A "PRACTICAL" ASSESSMENT. TO PASS THE EXAM AS A WHOLE, EACH PART MUST BE PASSED WITH AT LEAST A SUFFICIENT GRADE; OTHERWISE, THE EXAM IS CONSIDERED NOT PASSED. THE FINAL GRADE (IF SUFFICIENCY IS REACHED IN EACH PART) IS GIVEN BY THE SUM OF THE GRADES OF THE TWO PARTS. FIRST PART: THE "THEORETICAL" ASSESSMENT CONSISTS OF A PRESENTATION BY THE STUDENT ON A TOPIC OF INTEREST (PERTINENT TO THE COURSE) FROM A TECHNOLOGICAL, METHODOLOGICAL, AND/OR APPLICATION POINT OF VIEW, BASED ON RESEARCH CARRIED OUT INDIVIDUALLY AND CRITICALLY, WITH APPROPRIATE CONNECTIONS AND PARALLELS TO THE THEMES STUDIED DURING THE COURSE. SECOND PART: THE "PRACTICAL" ASSESSMENT CONCERNS A PROJECT CARRIED OUT BY THE STUDENT WITHIN A TEAM, WHICH AIMS TO USE SOME OF THE TECHNOLOGIES STUDIED DURING THE COURSE AND/OR THOSE THAT EMERGED FROM THE INDIVIDUAL RESEARCH. |
Texts | |
---|---|
MARZ, N., & WARREN, J. (2015). BIG DATA: PRINCIPLES AND BEST PRACTICES OF SCALABLE REAL-TIME DATA SYSTEMS. NEW YORK: MANNING PUBLICATIONS CO. SUGGESTED READINGS: BAHGA, A., & MADISETTI, V. (2016). BIG DATA SCIENCE & ANALYTICS: A HANDS-ON APPROACH. VPT. |
More Information | |
---|---|
LINKS TO ADDITIONAL MATERIAL AND TEACHING MATERIALS WILL BE PROVIDED. |
Data source: ESSE3