1 mile = 1.609344 kilometres
The Flickr API will disable your key if you query too rapidly, so it makes sense to do large queries which return hundreds of results.
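As a rough sketch of what a "large query" could look like, the snippet below builds a flickr.photos.search request that asks for 500 results per page and spaces successive requests out with a fixed pause. The value 500 is the documented per-page maximum for ordinary searches (geo-constrained queries are capped lower). The helper names, the one-second delay, and the placeholder key are assumptions for illustration and are not taken from the original scripts.

```python
import time
from urllib.parse import urlencode

REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, text, page=1, per_page=500):
    """Build a flickr.photos.search URL that returns many results per call."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "text": text,
        "per_page": per_page,   # fewer, larger requests instead of many small ones
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return REST_ENDPOINT + "?" + urlencode(params)


def throttled(urls, delay_seconds=1.0, sleep=time.sleep):
    """Yield URLs one at a time, pausing between consecutive ones."""
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # assumed delay; tune to stay well under the rate limit
        yield url


if __name__ == "__main__":
    pages = [build_search_url("YOUR_API_KEY", "golden gate bridge", page=p)
             for p in (1, 2, 3)]
    for url in throttled(pages, delay_seconds=0.0):
        print(url)
```

The requests themselves are left out on purpose; any HTTP client can fetch the URLs that throttled() yields.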
The image download code is written in MATLAB. It accesses images on Flickr’s HTTP server instead of going through the API, and thus doesn’t require an API key. It reads the text files produced by get_imgs_geo_gps_search.py, downloads the photos, and saves all of the image attributes (tags, interestingness, long/lat, etc.) as a MATLAB cell string array in the comment field of each jpg. Use imfinfo() to read them later.
a) What size images will this get?
Currently the code first tries to find the Flickr “Large” size photo, which has a maximum width or height of 1024 pixels. Failing that, it tries to get the “Original” size photo. If the “Original” is larger than 1024 pixels in height or width, it is downsampled to 1024. If it is smaller than 500 pixels in height or width, it is thrown away. Otherwise the image is kept.
A significant fraction of images are too small by this criterion and thus are thrown away. An alternative strategy would be to download only the default size images, which will always be available although somewhat small.
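The size rules above can be sketched as a small decision function. This is a Python illustration, not the MATLAB code itself. It assumes the 500-pixel check applies to whichever size was picked, that "smaller than 500 height or width" means either side is below 500, and that a photo with neither size is discarded. The name plan_download and the tuple it returns are hypothetical.

```python
MAX_SIDE = 1024  # longest side allowed after download
MIN_SIDE = 500   # photos with a side shorter than this are thrown away


def plan_download(sizes):
    """Decide what to do with a photo.

    sizes maps a Flickr size label to its (width, height) in pixels.
    Returns (action, label, final_size), where action is one of
    "keep", "downsample" or "discard".
    """
    if "Large" in sizes:
        label = "Large"
    elif "Original" in sizes:
        label = "Original"
    else:
        return ("discard", None, None)  # assumption: nothing usable to fetch

    width, height = sizes[label]
    if min(width, height) < MIN_SIDE:
        return ("discard", label, None)

    longest = max(width, height)
    if longest > MAX_SIDE:
        scale = MAX_SIDE / longest
        return ("downsample", label, (round(width * scale), round(height * scale)))
    return ("keep", label, (width, height))


if __name__ == "__main__":
    print(plan_download({"Large": (1024, 768)}))
    print(plan_download({"Original": (2048, 1536)}))
    print(plan_download({"Original": (640, 480)}))
```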
b) How are the images written to disk?
Since most file systems have trouble with thousands of files in a directory, the images are put into a hierarchy of directories that contain no more than 1000 images each. The hierarchy is
base_db_path / keyword / numbered subdir / img_name
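A minimal sketch of how such a path could be computed, assuming images are numbered in download order and the numbered subdirectory is a zero-padded counter starting at 1. The padding width and the function name image_path are illustrative guesses; the real script may name its subdirectories differently.

```python
import os

IMAGES_PER_DIR = 1000  # no directory holds more than this many images


def image_path(base_db_path, keyword, image_index, img_name):
    """Return base_db_path/keyword/<numbered subdir>/img_name.

    image_index is the zero-based position of the image within its keyword.
    """
    subdir = "%05d" % (image_index // IMAGES_PER_DIR + 1)
    return os.path.join(base_db_path, keyword, subdir, img_name)


if __name__ == "__main__":
    for index in (0, 999, 1000, 2500):
        print(image_path("flickr_db", "paris", index, "example.jpg"))
```

Images 0 to 999 land in the first subdirectory, 1000 to 1999 in the second, and so on.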
The image filenames contain the photo id, secret, server id, and owner which can be used to trace the .jpg back to its source on Flickr. See the source code for examples of how the URLs are constructed.
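For orientation, here is a sketch of how those four fields map back to Flickr URLs, using the URL patterns from Flickr's public documentation. The original MATLAB code may build its URLs differently (for example with older host names), and the exact layout of the fields inside the filename is not shown here, so the functions take the fields as separate arguments. The function names and the sample values are made up.

```python
def photo_source_url(photo_id, secret, server_id, size_suffix="b"):
    """Direct link to the .jpg; suffix "b" is the 1024-pixel "Large" size."""
    return "https://live.staticflickr.com/%s/%s_%s_%s.jpg" % (
        server_id, photo_id, secret, size_suffix)


def photo_page_url(owner, photo_id):
    """Link to the photo's page on Flickr, useful for checking attribution."""
    return "https://www.flickr.com/photos/%s/%s" % (owner, photo_id)


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    print(photo_source_url("1234567890", "abcdef1234", "4321"))
    print(photo_page_url("12345678@N00", "1234567890"))
```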
c) Can I run the download script in parallel?
Yes, I’ve run 15 copies in parallel in the past. I wouldn’t recommend doing any more than this because Flickr could get mad at us. They’re aware that researchers are using Flickr as a data source but their main concern is that we don’t impact the quality of service for the millions of people who use Flickr.
To run multiple scripts in parallel you’ll need to split up the text files from the query process manually, then change the path in downloadphotos_int.m for each call.
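If splitting by hand gets tedious, the bookkeeping can be sketched in a few lines of Python. This assumes the query step produced several text files and that whole files (not individual lines) are handed to each copy of the download script. The function name partition_files and the cap of 15 workers, taken from the advice above, are illustrative.

```python
MAX_WORKERS = 15  # more parallel copies than this risks annoying Flickr


def partition_files(file_names, num_workers):
    """Deal the query-result files round-robin into num_workers groups."""
    if not 1 <= num_workers <= MAX_WORKERS:
        raise ValueError("num_workers must be between 1 and %d" % MAX_WORKERS)
    ordered = sorted(file_names)
    return [ordered[i::num_workers] for i in range(num_workers)]


if __name__ == "__main__":
    files = ["paris.txt", "rome.txt", "tokyo.txt", "london.txt", "berlin.txt"]
    for worker, group in enumerate(partition_files(files, 2), start=1):
        print("worker", worker, "gets", group)
```

Each group can then be moved into its own directory, with the path in downloadphotos_int.m pointed at that directory for the corresponding run.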
d) What about copyrights?
It is worth noting that Flickr allows photographers to specify Creative Commons licenses for their images instead of the default “all rights reserved”. This script saves the license info with each .jpg file, so you can pick out Creative Commons images after the fact (in my experience it’s less than 10% of images). It is also possible to restrict the search to images with certain licenses at query time. See the Flickr API for details.
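A hedged sketch of both approaches in Python: filtering already-downloaded records by license, and building the comma-separated value that the license argument of flickr.photos.search accepts. The record layout (a dict with a "license" key) is hypothetical, since the real attributes live in the jpg comment field. The ids 1 to 6 are the Creative Commons ids as I understand them from flickr.photos.licenses.getInfo, so check them against the API before relying on them.

```python
# Assumed Creative Commons license ids; verify with flickr.photos.licenses.getInfo.
CC_LICENSE_IDS = {"1", "2", "3", "4", "5", "6"}


def creative_commons_only(records):
    """Keep only records whose license id is one of the assumed CC ids."""
    return [r for r in records if str(r.get("license")) in CC_LICENSE_IDS]


def license_param(license_ids):
    """Comma-separated id list for the 'license' search argument."""
    return ",".join(sorted((str(i) for i in license_ids), key=int))


if __name__ == "__main__":
    # Hypothetical records for illustration.
    photos = [
        {"id": "111", "license": "0"},  # all rights reserved
        {"id": "222", "license": "4"},
        {"id": "333", "license": 5},
    ]
    print(creative_commons_only(photos))
    print(license_param(CC_LICENSE_IDS))
```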
From this post: let’s look at using a freely available library called flickrpy. Download the file flickr.py. You will need an API key from Flickr to get this to work. Keys are free for non-commercial use. Just click the link “Apply for a new API Key” on the Flickr API page and follow the instructions.
Once you have an API key, open flickr.py and replace the empty string on the line
API_KEY = ''
with your key. It should look something like this:
API_KEY = '123fbbb81441231123cgg5b123d92123'
Let’s create a simple command line tool that downloads images tagged with a particular tag. Add the following code to a new file called tagdownload.py.
# Python 2 script (print statements, urllib.URLopener, urlparse)
import flickr
import urllib, urlparse
import os
import sys

if len(sys.argv) > 1:
    tag = sys.argv[1]  # the first command-line argument is the tag
else:
    print 'no tag specified'
    sys.exit(1)  # nothing to search for, so stop here

# downloading image data
f = flickr.photos_search(tags=tag)
urllist = []  # store a list of what was downloaded

# downloading images
for k in f:
    url = k.getURL(size='Medium', urlType='source')
    urllist.append(url)
    image = urllib.URLopener()
    # save under the file name taken from the URL path
    image.retrieve(url, os.path.basename(urlparse.urlparse(url).path))
    print 'downloading:', url
If you also want to write the list of urls to a text file, add the following lines at the end.
# write the list of urls to file
fl = open('urllist.txt', 'w')
for url in urllist:
    fl.write(url+'\n')
fl.close()
From the command line, just type
$ python tagdownload.py goldengatebridge
and you will get the 100 latest images tagged with “goldengatebridge”. As you can see, we chose to take the “Medium” size. If you want thumbnails, full-size originals, or something else, there are many sizes available; check the documentation on the Flickr website.