Comments for AcadGild

Comment on File Formats in Apache HIVE by mohan

mohan — Thu, 25 Aug 2016 12:12:23 +0000

Very useful blog. Super!

Comment on Hive Use case – Counting Hashtags Using Hive by Esakki

Esakki — Tue, 23 Aug 2016 14:54:32 +0000

Hello guys,
I am new to Hadoop and Hive. I am trying to play with Twitter data in Hive, but I am getting the error below. Could anyone please help me solve the issue?
Queries used:
CREATE EXTERNAL TABLE tweets (id BIGINT, entities STRUCT<hashtags:ARRAY<STRUCT<text:STRING>>>) ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe' LOCATION '/esak';
create table hashtags as select id as id,entities.hashtags.text as words from tweets;
When I execute the second query, I get the error below:
hive> create table hashtags as select id as id,entities.hashtags.text as words from tweets;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = esak_20160823195256_135b0bdf-3fbb-4f69-8402-6f910afd0e9e
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there’s no reduce operator
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:724)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:600)
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:94)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:95)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:190)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:433)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:138)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Job Submission failed with exception ‘org.apache.hadoop.io.nativeio.NativeIOException(No such file or directory)’
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. No such file or directory
hive>
Please help me.

Comment on MapReduce Use Case – Uber Data Analysis by Avinash

Avinash — Mon, 22 Aug 2016 06:18:10 +0000

I got the same error too. Removing the header line from the provided sample input text got it to work.

Comment on MapReduce Use Case – Uber Data Analysis by Avinash

Avinash — Mon, 22 Aug 2016 06:15:48 +0000

I think we are not arriving at the correct solution. The problem as stated is: "In this problem statement, we will find the days on which each basement has more trips." However, the solution provided just gives the sum for each day rather than finding ON WHICH DAY the trips/active vehicles are at their maximum.
So it is just like another word count example.

Please correct me if I am wrong.
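
The distinction raised in this comment can be illustrated with a minimal Python sketch. It is not the blog's MapReduce solution. The function name busiest_day_per_basement, the record layout (basement, day, trips), and the sample values are all assumptions made up for illustration. The first loop is the "word count" style sum per (basement, day). The second loop is the extra step that picks the day with the largest total for each basement.

```python
from collections import defaultdict

def busiest_day_per_basement(records):
    """Return {basement: (day, total_trips)} for the day with the most trips."""
    # Step 1: sum trips per (basement, day); this is the word-count-like part.
    totals = defaultdict(int)
    for basement, day, trips in records:
        totals[(basement, day)] += trips

    # Step 2: for each basement keep only the day with the largest total.
    # On a tie, the day seen first is kept.
    best = {}
    for (basement, day), total in totals.items():
        if basement not in best or total > best[basement][1]:
            best[basement] = (day, total)
    return best

# Hypothetical sample records, invented for illustration only.
sample = [
    ("B02512", "Mon", 100),
    ("B02512", "Tue", 150),
    ("B02512", "Mon", 80),
    ("B02765", "Fri", 40),
    ("B02765", "Sat", 90),
]
result = busiest_day_per_basement(sample)
```

In a MapReduce job, the second step would typically need a second pass or a reducer keyed on the basement alone, since the first job is keyed on the (basement, day) pair.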

Comment on HealthCare Use Case With Apache Spark by villa

villa — Sun, 21 Aug 2016 01:03:10 +0000

I am interested in trying out big data analytics in the health sector, but I need the data to test with, since you have 2 GB of it.
adrianamaria25@hotmail.com

Comment on Google Admob with Banner and Interstitial Ads in Android Apps by user tricky

user tricky — Sat, 20 Aug 2016 17:56:00 +0000

Can I get an example of a WebView app with AdMob?

Comment on Best Techniques to Optimize your Website for Better Performance by Ankit Mehta

Ankit Mehta — Sat, 20 Aug 2016 05:02:37 +0000

Nice Article!

A few more things to add from my experience. Most of the time, when an application is developed, we do not consider advanced techniques such as domain sharding, lazy loading, DNS prefetch, minification, and compression.

1. Domain Sharding:
All browsers limit the number of connections to a single domain, so if we serve all objects from the same domain, the site will be slower. To make the site faster, you can use multiple domains to serve the content. For example, you can create a subdomain such as images.yoursite.com for images and js.yourdomain.com for JS. This helps serve the content faster through parallel loading.

Apart from that, wherever it is possible to load static objects from a CDN, use it. For jQuery, you can use any of the better-performing CDNs (MaxCDN, Google Code, or jsDelivr).

2. Lazy Load:
Applying a lazy-load mechanism delays image loading so that your site becomes visible a little sooner. In other words, not all objects should be loaded at the same time. Load only the objects that are required.

3. DNS Prefetch:
DNS prefetch helps the site reduce DNS lookup time. Make sure that all the domains and subdomains used in domain sharding are included in the prefetch.

4. Minification:
Always use the minified versions of the CSS and JS files for the site. Unminified files are larger to download and can noticeably hurt site performance.

5. Compression:
Do not limit compression to CSS, JS, and HTML only. Use compression for images as well. Using progressive image compression will help the site load faster.
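
Two of the points above (DNS prefetch and compression) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names dns_prefetch_tags and gzip_ratio are hypothetical, the host names are the placeholder subdomains from point 1, and the sample stylesheet is invented. The first helper builds the link rel="dns-prefetch" tags that would go in the page head. The second measures how much a text asset shrinks under gzip, which is one common compression scheme for CSS, JS, and HTML.

```python
import gzip

def dns_prefetch_tags(hosts):
    """Build one <link rel="dns-prefetch"> tag per host for the page <head>."""
    return [f'<link rel="dns-prefetch" href="//{host}">' for host in hosts]

def gzip_ratio(text):
    """Return compressed size divided by original size for a text asset."""
    raw = text.encode("utf-8")
    return len(gzip.compress(raw)) / len(raw)

# Placeholder shard hosts, taken from the example subdomains in point 1.
shard_hosts = ["images.yoursite.com", "js.yourdomain.com"]
tags = dns_prefetch_tags(shard_hosts)

# Repetitive sample stylesheet, invented for illustration only.
sample_css = "body { margin: 0; padding: 0; }\n" * 200
ratio = gzip_ratio(sample_css)
```

In practice the compression itself is usually switched on in the web server or CDN configuration. The ratio here only shows why repetitive text assets benefit from it.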

Comment on Front End Interview Preparation Guide by Irfan Khan

Irfan Khan — Thu, 18 Aug 2016 12:45:18 +0000

Very informative and useful post.