Hadoop: MiniDFSCluster setup

+2 votes
605 views

I read that I can use MiniDFSCluster to set up my own tests in case I modify the Hadoop source code. I have built Hadoop 2.2. However, I can't find any source on how to get MiniDFSCluster working. Can someone point me to a link that helps?

posted Dec 14, 2013 by Sonu Jindal


1 Answer

+1 vote

If you search under hadoop-hdfs-project/hadoop-hdfs/src/test, you will see a lot of tests that use MiniDFSCluster, e.g. hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestWriteRead.java:

cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
answer Dec 15, 2013 by anonymous
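
For readers who want to see the builder call from the answer in context, here is a minimal sketch of a standalone program that starts a MiniDFSCluster, writes a small file, and reads it back. It is not taken from TestWriteRead.java; the class name MiniDfsClusterSketch and the path /sketch/hello.txt are made up for illustration, and the sketch assumes the hadoop-hdfs test classes (where MiniDFSCluster lives) and their dependencies are on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

// Hypothetical class name; requires the hadoop-hdfs tests jar on the classpath.
public class MiniDfsClusterSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new HdfsConfiguration();

        // Same builder call as in the answer: an in-process cluster with 3 DataNodes.
        MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
        try {
            // Block until the NameNode and DataNodes are up.
            cluster.waitActive();

            FileSystem fs = cluster.getFileSystem();
            Path file = new Path("/sketch/hello.txt"); // illustrative path

            // Write a short string into the mini cluster's HDFS.
            try (FSDataOutputStream out = fs.create(file)) {
                out.writeUTF("hello");
            }

            // Read the string back and print it.
            try (FSDataInputStream in = fs.open(file)) {
                System.out.println(in.readUTF());
            }
        } finally {
            // Always stop the cluster so its threads and ports are released.
            cluster.shutdown();
        }
    }
}
```

In a real test you would typically move the cluster start-up and shutdown into JUnit set-up and tear-down methods, as the existing tests under hadoop-hdfs/src/test do.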
Running the test from Maven on the command line works fine, but I am using Eclipse. It runs into problems when I try to run the test as a JUnit test, as if Eclipse is not aware of any of the conf parameters or args. Can someone point me to a detailed source that explains how to run JUnit tests through Eclipse for Hadoop 2.2.x?
You can use the following command to generate .project files for Eclipse (at the root of your workspace):
mvn clean package -DskipTests eclipse:eclipse

When you import Hadoop, all sub-projects will be imported.
I was able to run TestWriteRead in Eclipse successfully.
Similar Questions
+3 votes

I have set up an HDP 2.3 cluster on Linux (CentOS). Now I am trying to use my ETL programs to access this cluster from a Windows environment.
Should I set up Apache Hadoop on a local Windows machine or server? What setup should I do? What goes into core-site.xml (should I mention my remote HDFS URL)?
Any pointers would be helpful.

+3 votes
  1. How does Hadoop provide multi-tenancy using schedulers, or in simple terms, "WHAT ARE THE STEPS TO CONFIGURE A MULTI-TENANT HADOOP CLUSTER?"
    Here multi-tenancy means different users can run their applications (similar or different) in such a way that each user is completely unaware of the others, one user can't interfere with another user's data in HDFS, so the data is secure, and each user gets a fair share of resources to execute applications in parallel.

  2. Is there any way to verify that cluster tenants are able to get their applications executed easily without any other intervention while keeping their data secure and safe in HDFS?

0 votes

I have a small problem. I need to integrate a Hadoop web interface with our web application. I just need a Hadoop interface where we can run some Hadoop commands, something like:

1 cat hadoop dfs -cat prints the file contents
2 chgrp hadoop dfs -chgrp [-R] GROUP URI [URI …]
3 chmod hadoop dfs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI
4 hadoop dfsadmin -setSpaceQuota ********** /user/esammer
5 hadoop dfsadmin -report
6 copyFromLocal hadoop dfs -copyFromLocal URI

For this I need a web interface. I have already installed Cloudera Manager. I am using this version: Cloudera Enterprise Data Hub Edition Trial 5.1.1 (#82 built by jenkins on 20140725-1608 git: cb9ebb729efc7929e1968b23dc6cf776086e20a7)

...