Sometimes it is very useful to distribute a file across the nodes for a task. A classic case is a JOIN against a small metadata file. The file can be on the local file system or on HDFS.
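To make the JOIN case concrete, here is a minimal sketch in plain Java of what each task might do once the small file is available on its node: load it into an in-memory map, then look up each incoming record. It deliberately uses no Hadoop API. The class name SmallFileJoin, the tab-separated layout, and the sample values are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: plain Java, no Hadoop API involved.
class SmallFileJoin {

    // Parse the small metadata file (assumed "key<TAB>value" per line) into a map.
    // Lines without a tab are skipped.
    static Map<String, String> loadLookup(List<String> metadataLines) {
        Map<String, String> lookup = new HashMap<>();
        for (String line : metadataLines) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) {
                lookup.put(parts[0], parts[1]);
            }
        }
        return lookup;
    }

    // Inner join: each record (assumed "key<TAB>payload") is extended with the
    // metadata value for its key; records with no matching key are dropped.
    static List<String> join(List<String> records, Map<String, String> lookup) {
        List<String> joined = new ArrayList<>();
        for (String record : records) {
            String[] parts = record.split("\t", 2);
            if (parts.length < 2) {
                continue;
            }
            String meta = lookup.get(parts[0]);
            if (meta != null) {
                joined.add(parts[0] + "\t" + parts[1] + "\t" + meta);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the metadata file and the big input.
        Map<String, String> lookup = loadLookup(Arrays.asList("1\tVISA", "2\tMASTERCARD"));
        List<String> out = join(Arrays.asList("1\t10.50", "3\t7.00", "2\t99.99"), lookup);
        for (String row : out) {
            System.out.println(row);
        }
    }
}
```

In a real job the lookup map would typically be built once per task, before any records are processed, which is why the metadata file has to be present on every node.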
Suppose you use this Java snippet: DistributedCache.addCacheFile(new URI("/model/conf/txn_header"), conf); You would expect the file to be read from HDFS, but when the job is launched from the Eclipse plug-in, Hadoop actually looks for it on the local file system and throws java.io.FileNotFoundException (the same job runs without problems from the Hadoop command line). To solve this, add the Hadoop configuration to the job config: