Hadoop implements a computational paradigm named MapReduce. If you have no idea what MapReduce is, read the Wikipedia article on it first.
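To make the paradigm a bit more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps in plain Python, using the classic word-count example. It does not use Hadoop or its API at all; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and a real framework would spread these steps across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Run the map step on every document and flatten the emitted pairs.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    # Group the pairs by word, then reduce each group to a total count.
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(docs))
```

The point of the split is that each `map_phase` call only looks at one document and each `reduce_phase` call only looks at one key, so both steps can in principle run in parallel on different nodes.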
Some Hadoop Related Terms:
- Hadoop: An open-source MapReduce implementation, developed largely at Yahoo!
- HDFS: Distributed file system written in Java for the Hadoop framework
- Pig: High level scripting language to work with Hadoop
- Hive: Data warehouse infrastructure to work with Hadoop, uses HiveQL (an SQL-like language)
- HBase: A non-relational key/value datastore to work with Hadoop
- Mahout: A set of machine learning algorithms to work with Hadoop on big data
- Dryad
  - Developed at Microsoft
  - Tasks are modeled as a directed acyclic graph
  - Sequential programs are connected using one-way channels
- S4
  - Developed at Yahoo!
  - Stream processing
  - Runs on the Java platform
- Spark
  - Developed at UC Berkeley
  - Keeps data in memory for queries instead of relying only on disk I/O
  - Implemented in Scala
  - Needs a cluster manager (for example, Mesos)
- Storm
  - Developed by Twitter
  - Stream processing
  - Guarantees that every message is processed
- BashReduce
  - Works with Linux commands such as sort and grep
- Disco
  - Developed at Nokia
  - Backend is written in Erlang
  - Works with Python
- GraphLab
  - Developed at CMU
  - For machine learning tasks
  - Data should fit in main memory
  - Is not fault tolerant
- HPCC
  - Uses Enterprise Control Language (ECL)
  - Written in C++