Digg's recent performance has declined and it has had some technical problems, but as a leading social news site the technology behind it is still worth exploring. Digg engineer Dave Beckett's recent article, "How Digg is Built", describes that technology.
one, the services Digg provides
a social news site
for the individual, a personalized, community-based news distribution platform
an advertising platform
an open API platform
blog and documentation sites
two, Digg's core functions
article submission: users submit information they have found and consider valuable
article listing: lists the submitted articles along various dimensions (for example personalized, or most recently published)
article actions: users can perform various operations on an article, including reading, clicking, digging, commenting, and rating comments
Top articles: Digg periodically promotes some popular articles to the Digg home page so that more people can see them
three, the implementation behind Digg's features
First, look at a flowchart that describes which Digg modules are involved while a typical user uses the site:
In fact, these operations fall into two main parts: synchronous and asynchronous.
synchronous, immediate response to users: the synchronous part gives fast, real-time responses to user requests (including API requests), including AJAX requests made asynchronously from within a page. These operations normally take a second or two at most to complete.
offline batch for asynchronous computing: in addition to responding to requests in real time, Digg periodically needs to run batch computing tasks. These may be triggered indirectly by users, but users do not wait for them to finish. Such asynchronous computations may take a few seconds, minutes, or even hours.
The two parts described above can be depicted with the following diagram:
The above is a general overview. The following sections look at each part of Digg in more depth.
1. online network system
The components that serve web pages and API requests are built as follows: PHP is used for the front end and the CMS, and the API servers are written in Python and run on Tornado. Both interact with the main storage layer through Thrift, and a lot of data is cached in in-memory storage systems such as Memcached and Redis.
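The caching described above is commonly done with a cache-aside read: check the cache first, and only call the storage layer on a miss. The following is a minimal sketch of that pattern, not Digg's actual code. `TTLCache` is a hypothetical in-process stand-in for Memcached or Redis, and `fetch_story` and its `backend` callable (which in Digg's case would be a Thrift call) are names invented for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-process stand-in for Memcached/Redis (hypothetical, for illustration)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # An expired entry behaves like a miss.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._items[key] = (self._clock() + ttl, value)


def fetch_story(
    story_id: str,
    cache: TTLCache,
    backend: Callable[[str], str],
    ttl: float = 60.0,
) -> str:
    """Cache-aside read: try the cache, fall back to the storage layer."""
    key = f"story:{story_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = backend(story_id)  # assumed to be the slow call to the storage layer
    cache.set(key, value, ttl)
    return value
```

Repeated calls to `fetch_story` for the same id within the TTL reach the backend only once. The TTL bounds how stale a cached value can become.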
2. messaging system
Digg uses RabbitMQ as its queuing system. Operations that do not need a synchronous response are put on a queue and handled asynchronously.
3. asynchronous batch processing system
This system sits behind the message queue described above: it takes jobs off specific queues and executes them, performing the required computation and then writing the result to the main storage. Operations on the main storage are the same whether they come from the real-time system or from the asynchronous batch system.
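The split between the synchronous request path and the asynchronous workers can be sketched as follows. This is an illustration under stated assumptions rather than Digg's implementation: Python's standard `queue.Queue` stands in for a RabbitMQ queue, a plain dictionary stands in for the main storage, and the names `handle_digg_request`, `worker`, and `run_demo` are hypothetical.

```python
import queue
import threading
from typing import Dict

# Stand-in for the main storage layer (hypothetical): story id -> digg count.
digg_counts: Dict[str, int] = {}
# Stand-in for a RabbitMQ queue; a real deployment would talk to a broker.
jobs: queue.Queue = queue.Queue()


def handle_digg_request(story_id: str) -> str:
    """Synchronous path: enqueue the work and answer the user right away."""
    jobs.put({"type": "digg", "story_id": story_id})
    return "accepted"


def worker() -> None:
    """Asynchronous path: take jobs off the queue and update storage."""
    while True:
        job = jobs.get()
        try:
            if job is None:  # sentinel value: stop the worker
                return
            if job["type"] == "digg":
                story_id = job["story_id"]
                digg_counts[story_id] = digg_counts.get(story_id, 0) + 1
        finally:
            jobs.task_done()


def run_demo() -> Dict[str, int]:
    thread = threading.Thread(target=worker)
    thread.start()
    for story_id in ["a", "b", "a"]:
        handle_digg_request(story_id)
    jobs.put(None)   # ask the worker to stop once earlier jobs are done
    jobs.join()      # wait until every queued job has been processed
    thread.join()
    return dict(digg_counts)
```

The request handler returns as soon as the job is queued, so its latency does not depend on how long the worker takes.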
4. data storage layer
Digg's data storage layer uses multiple products for different tasks. The specific list is as follows:
Cassandra: stores objects such as articles, users, and Digg action records. We use Cassandra version 0.6; since version 0.6 does not support secondary indexes, we build the indexes in the application layer and then store them in Cassandra. For example, our data layer provides an interface for looking up user information by user name and by email address.
HDFS: mainly used to store log information for analysis. The logs are queried with Hive on Hadoop, and the computation is done with MapReduce.
MogileFS: a distributed file storage system used to store binary files such as user avatars and screenshots. Of course, there is a CDN layered on top of the file storage.
MySQL: at present, some of the data for our Top articles feature is stored in MySQL, because this feature requires a lot of JOIN operations. HBase also seems worth considering.
Redis: because of Redis's high performance and flexible data structures, we use it to provide caching for the Digg Streaming API, and we also use Redis to build real-time view and click counters.
SOLR: used to build the full-text indexing system, providing full-text search over article content, topics, and so on.
Scribe: a log collection system, more powerful and easier to use than syslog-ng. We use it to collect logs and write them into HDFS for analysis.
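The application-level secondary index mentioned for Cassandra 0.6 can be illustrated with a small sketch. It assumes nothing about Digg's real schema: plain dictionaries stand in for a key-value store that only supports lookup by primary key, and the class `UserStore` and its method names are hypothetical.

```python
from typing import Dict, Optional


class UserStore:
    """In-memory stand-in for a store without secondary indexes."""

    def __init__(self) -> None:
        self.users: Dict[str, dict] = {}    # primary data: user_id -> record
        self.by_name: Dict[str, str] = {}   # index: user name -> user_id
        self.by_email: Dict[str, str] = {}  # index: email -> user_id

    def save(self, user_id: str, name: str, email: str) -> None:
        old = self.users.get(user_id)
        if old is not None:
            # Drop stale index entries before writing the new ones.
            self.by_name.pop(old["name"], None)
            self.by_email.pop(old["email"], None)
        self.users[user_id] = {"id": user_id, "name": name, "email": email}
        self.by_name[name] = user_id
        self.by_email[email] = user_id

    def get_by_name(self, name: str) -> Optional[dict]:
        user_id = self.by_name.get(name)
        return self.users.get(user_id) if user_id is not None else None

    def get_by_email(self, email: str) -> Optional[dict]:
        user_id = self.by_email.get(email)
        return self.users.get(user_id) if user_id is not None else None
```

Every write updates the primary record and both index entries, which is the cost of maintaining indexes in the application. In a real distributed store these writes are not atomic, so the application also has to tolerate an index entry that briefly points at a missing or outdated record.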
5. operating system and configuration
Digg runs on Debian stable based GNU/Linux servers, which we configure with Clusto, Puppet, and a configuration system built on ZooKeeper.
translation link: http://blog.nosqlfan.com/html/1575.html
original text link: http://about.digg.com/blog/how-digg-is-built