
domon Big Data

Published: 2023-07-30 18:41:50

1. Looking for an original English-language paper related to big data or big-data information security, together with a translation, about 3,000 words. Please help!

Big data refers to volumes of data that cannot be stored and processed within a given time frame by a traditional file system.
The next question that comes to mind is how big the data needs to be in order to qualify as big data. There is a lot of misconception around the term. We usually call data "big" if its size is in gigabytes, terabytes, petabytes, exabytes, or anything larger, but size alone does not define big data. Even a small file can be considered big data, depending on the context in which it is used.
An example makes this clear. If we try to attach a 100 MB file to an email, we cannot, because email does not support attachments of that size. With respect to email, therefore, this 100 MB file can be referred to as big data. Similarly, if we want to process 1 TB of data in a given time frame, we cannot do it with a traditional system, since its resources are not sufficient to accomplish the task.
As you are aware, social sites such as Facebook, Twitter, Google+, LinkedIn, and YouTube contain huge amounts of data, and as the user bases of these sites grow, storing and processing this enormous data becomes a challenging task. Storing this data is important for the various firms that generate large revenue from it, and it is not possible with a traditional file system. This is where Hadoop comes into existence.
Big data simply means huge amounts of structured, unstructured, and semi-structured data that can be processed for information. Nowadays a massive amount of data is produced because of growth in technology and digitalization, and by a variety of sources, including business application transactions, videos, pictures, e-mails, social media, and so on. The big data concept was introduced to process such data.
Structured data: data that has a proper format associated with it, for example data stored in database files or in Excel sheets.
Semi-structured data: data that has some structure but no fixed format, for example data stored in mail files or in .docx files.
Unstructured data: data that has no format associated with it, for example image files, audio files, and video files.
Big data is characterized by the 3 Vs, which are as follows: [1]
Volume: the amount of data being generated, i.e. its huge quantity.
Velocity: the speed at which the data is being generated.
Variety: the different kinds of data being generated.
A. Challenges Faced by Big Data
There are two main challenges faced by big data: [2]
i. How to store and manage a huge volume of data efficiently.
ii. How to process and extract valuable information from a huge volume of data within a given time frame.
These main challenges led to the development of the Hadoop framework.
Hadoop is an open-source framework created by Doug Cutting in 2006 and managed by the Apache Software Foundation. Hadoop was named after a yellow toy elephant.
Hadoop was designed to store and process data efficiently. The Hadoop framework comprises two main components:
i. HDFS: the Hadoop Distributed File System, which takes care of the storage of data within a Hadoop cluster.
ii. MapReduce: which takes care of the processing of the data that is present in HDFS.
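The map/shuffle/reduce flow can be imitated with ordinary Unix pipes. A minimal word-count sketch (the input string is illustrative, not from the paper):

```shell
#!/bin/sh
# Word count in the style of a MapReduce job, using pipes:
# "map" emits one key per line, "shuffle" groups identical keys,
# "reduce" aggregates each group into a count.
printf 'big data big cluster data big\n' |
  tr ' ' '\n' |   # map: one (word) record per line
  sort |          # shuffle: bring identical keys together
  uniq -c |       # reduce: count occurrences of each key
  sort -rn        # order results by count, descending
```

This is only an analogy: a real MapReduce job runs the map and reduce stages on many machines in parallel, whereas the pipeline above runs on one.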
Now let's have a look at a Hadoop cluster. There are two kinds of nodes: the master node and the slave nodes.
The master node is responsible for the NameNode and JobTracker daemons. Here "node" is the technical term for a machine present in the cluster, and "daemon" is the technical term for a background process running on a Linux machine.
The slave nodes, on the other hand, are responsible for running the DataNode and TaskTracker daemons.
The NameNode and DataNodes are responsible for storing and managing the data and are commonly referred to as storage nodes, whereas the JobTracker and TaskTrackers are responsible for processing and computing the data and are commonly known as compute nodes.
Normally the NameNode and JobTracker run on a single machine, whereas the DataNodes and TaskTrackers run on different machines.
B. Features of Hadoop [3]
i. Cost-effective system: it does not require any special hardware. It can simply be deployed on common machines, technically known as commodity hardware.
ii. Large cluster of nodes: a Hadoop cluster can support a large number of nodes, which provides huge storage and processing capacity.
iii. Parallel processing: a Hadoop cluster provides the ability to access and process data in parallel, which saves a lot of time.
iv. Distributed data: Hadoop takes care of splitting and distributing data across all nodes within a cluster. It also replicates the data over the entire cluster.
v. Automatic failover management: once automatic failover management is configured on a cluster, the administrator need not worry about a failed machine. Hadoop replicates the data: each copy is replicated to a node in the same rack, and Hadoop takes care of the internetworking between two racks.
vi. Data locality optimization: this is the most powerful feature of Hadoop. Instead of moving a huge volume of data to the machine where the code runs, Hadoop sends the code to the machine where the data resides and executes it there, which saves a lot of bandwidth.
vii. Heterogeneous cluster: nodes can come from different vendors and can run different flavors of operating systems.
viii. Scalability: in Hadoop, adding or removing a machine does not affect the cluster, nor does adding or removing components of a machine.
C. Hadoop Architecture
Hadoop comprises two components:
i. HDFS
ii. MapReduce
Hadoop splits big data into several chunks and stores the data on several nodes within a cluster, which significantly reduces the processing time.
Hadoop replicates each chunk of data onto machines present within the cluster. The number of copies depends on the replication factor; by default the replication factor is 3, so in that case there are 3 copies of each piece of data on 3 different machines.
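A consequence of replication is that raw cluster capacity must exceed the logical data size by the replication factor. A quick sketch of that arithmetic (the figures are illustrative; 3 is Hadoop's default replication factor):

```shell
#!/bin/sh
# Raw storage needed = logical data size * replication factor.
DATA_GB=1024        # 1 TB of logical data (illustrative)
REPLICATION=3       # Hadoop's default replication factor
RAW_GB=$((DATA_GB * REPLICATION))
echo "${DATA_GB} GB of data needs ${RAW_GB} GB of raw cluster storage"
```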

Reference: Mahajan, P., Gaba, G., & Chauhan, N. S. (2016). Big Data Security. IITM Journal of Management and IT, 7(1), 89-94.
Take it to a translation site and translate it yourself; feel free to ask if anything is unclear.

2. Linux: how do I make it visit a network address on a schedule?

Scheduled access on Linux can be implemented with crontab.
1. First edit the crontab (run crontab -e) and add the following entry:

# every two hours
0 */2 * * * sometask.sh

The line above runs sometask.sh every two hours (in practice, use the script's absolute path, since cron does not inherit your shell's working directory). We can then put the network access inside sometask.sh.
2. The code of sometask.sh:

#!/bin/sh
curl -d "user=test&password=123456" www.some123.com

With these two steps, the scheduled access is in place.
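Other schedules follow the same five-field crontab pattern (minute, hour, day of month, month, day of week). A few illustrative entries, with a placeholder script path:

```shell
# minute hour day-of-month month day-of-week  command
0 3 * * *     /home/user/sometask.sh   # every day at 03:00
*/10 * * * *  /home/user/sometask.sh   # every 10 minutes
30 8 * * 1    /home/user/sometask.sh   # every Monday at 08:30
```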

3. ISO document hierarchy

Take ISO 9001 as an example to explain this question:

The tier-1 document is the quality manual.

The tier-2 documents are the procedure documents, which must include the six procedures required by ISO 9001: document control, record control, internal audit, control of nonconforming product, corrective action, and preventive action (corrective and preventive action may be written as one procedure). Beyond these six, you can add others according to your company's actual situation.

The tier-3 documents are supporting documents or management rules; there are many of these, including work instructions, inspection procedures, management measures, management regulations, and so on.

For the ISO 14001 system, the tier-1 document is the environmental manual; for ISO 18001, it is the occupational health and safety management manual; and so on.

