This post is part of Investigating Companies: A Do-It-Yourself Handbook. Read, download or purchase the whole book here.
If you have access to a computer and an internet connection, you’ll no doubt be using the web to find lots of the information you want about a company. It can be a very useful and convenient way to research, giving you access to an ever-expanding treasure trove of information, but it can also lead to hours of time wasted as you get further away from what you went on the web for in the first place.
Look for leads and sources, but don’t aimlessly follow links, and get off the web once you’ve found what you want. It’s always tempting to see how Robbie Williams and Gary Barlow are getting on, but clicking on that link is not going to help you find out how much the directors of the major oil companies are making.
See our resources section for important security issues to bear in mind while searching the web.
Software systems that ‘crawl’ the web for information that they then store in a database are known as search engines. They index the information they find, so the most relevant results for a particular search can be retrieved as quickly as possible – by key word, date, language and so on.
Helping people search the web has become extremely profitable and the owners of the best-known search engines – Google being the most obvious – are among the biggest companies in the world. The most noticeable sign of this commercialisation is the ever-increasing number of adverts that litter search results pages. Some of these look a lot like ‘genuine’ links, so be careful what you’re clicking on.
Type a word into a search engine and it can yield hundreds of thousands of pages, so it’s good to narrow it down as much as possible. Search engines are getting better at working out what you’re looking for, but it’s still useful to know how to get the best results from your searches as quickly as possible. The most important advice is the most obvious: choose your words well. But you can also save time by telling your search engine to search in a particular way, by typing search instructions together with the words you are searching for.
Some of the more useful search instructions include:
Putting two or more words in “quotation marks” is a way of telling the search engine to look for them together, in the order you’ve typed them, and not for all the times they occur separately. If you’re looking for a specific report, for example, put the title in quotation marks. Searching for “Investigating Companies: A Do-It-Yourself Handbook” would bring up this guide.
You can also use quotation marks to search for a quote in order to find its source.
Adding site: to a search specifies the type of organisation whose websites you want to search. Searching for ‘corporate lobbying site:org.uk’, for example, will give results on corporate lobbying from campaign groups and other organisations whose websites have the .org.uk domain name. Adding site:gov.uk or site:ac.uk will do the same for UK government and educational institutions.
This can be very useful to get a non-corporate view of the world, as companies and corporate media sites spend lots of money on ‘search engine optimisation’ – i.e. getting their site higher up in search results – so when you look for a common term, the most popular results will often be from corporate sites.
You can also use this to search particular websites. Searching for ‘corporate lobbying site:www.bbc.co.uk’, for example, gives you articles on the issue just from the BBC. Searching for ‘corporate lobbying site:www.mining-journal.com’ would give you articles from the mining industry’s trade journal.
You can search websites from a particular country by using the country domain. ‘site:se’ for example, only gives you pages from Sweden.
A minus sign (-) typed directly in front of a word excludes that word from a search. ‘Prince -William’ will give you results about the purple musician and the brand of tennis balls, but not the royal heir.
FILETYPE:DOC, FILETYPE:PPT, FILETYPE:PDF
Use if you’re looking for something you think may be in a particular type of file, such as a Word document, PowerPoint presentation or PDF. Type filetype: followed by the file extension: ‘BP oil review filetype:ppt’, for example, brings up PowerPoint presentations by BP examining the oil market. Documents like this are sometimes not intended for the public and may contain useful or revealing information.
The related: instruction helps you find similar websites, although search engines can make odd choices. Type related:www.corporatewatch.org into Google and you’ll get the U.S. Chamber of Commerce (motto: “Standing Up for American Enterprise”).
The inurl: instruction searches for particular words in a URL (web address). The intitle: instruction does the same within the title of a webpage.
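If you run the same kind of search often, the instructions above can be put together by a short script instead of being typed out each time. The following is only a minimal sketch in Python: the function names build_query and search_url are made up for illustration, the search address is assumed to be Google’s usual one, and filetype: is used as the file type instruction. Other search engines may expect different instructions.

```python
from urllib.parse import urlencode


def build_query(words="", phrase=None, site=None, exclude=(),
                filetype=None, inurl=None, intitle=None):
    """Combine search words with search instructions into one query string."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')              # exact phrase, words in this order
    if words:
        parts.append(words)
    parts.extend(f"-{word}" for word in exclude)  # words to leave out
    if site:
        parts.append(f"site:{site}")             # a domain type, country or website
    if filetype:
        parts.append(f"filetype:{filetype}")     # e.g. doc, ppt, pdf
    if inurl:
        parts.append(f"inurl:{inurl}")           # word in the web address
    if intitle:
        parts.append(f"intitle:{intitle}")       # word in the page title
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Turn a query into an address you can paste into a browser.

    The base address is an assumption; change it for another search engine.
    """
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    query = build_query("corporate lobbying", site="org.uk")
    print(query)
    print(search_url(query))
```

Printing the address rather than fetching it is deliberate: you open the results in your own browser, where you can judge them as you would any other search.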
Some search engines allow you to specify the date a page or article was published (for example, within the last week, month, year). You can also specify the language you want the results to be in and which country they are from, among other things. Many search engines contain specialised databases that can give you a different set of results from their usual search engines. Most now have ‘News’ and ‘Images’ as options, and Google has patent and academic article searches, for example. These can narrow your results substantially.
Material on the web is always changing. Internet archive sites such as the Wayback Machine, which draws on crawls of the web going back to 1996 and stores old versions of websites, can help you find previous versions of pages that have since been changed. Type the address of the website or page you’re interested in into the Wayback site’s search box and you’ll be able to find previous versions from a range of dates.
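If you have many pages to check, the same lookup can be done from a script. The sketch below assumes the Internet Archive’s public ‘availability’ address (archive.org/wayback/available) still responds in the form shown in its documentation. The function names are invented for illustration, and the page address in the example is a placeholder, not a real lead.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the Internet Archive's availability lookup.
AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_url(page, timestamp=None):
    """Build the lookup address for a page, optionally near a given date."""
    params = {"url": page}
    if timestamp:
        params["timestamp"] = timestamp  # YYYYMMDD: ask for the copy closest to this date
    return f"{AVAILABILITY_API}?{urlencode(params)}"


def closest_snapshot(response):
    """Pick the archived copy's address out of a decoded reply, or return None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def find_snapshot(page, timestamp=None):
    """Ask the archive (over the internet) for the closest stored copy of a page."""
    with urlopen(availability_url(page, timestamp), timeout=30) as reply:
        return closest_snapshot(json.load(reply))


if __name__ == "__main__":
    # Placeholder address and date; replace with the page you are researching.
    print(find_snapshot("example.com", "20100101"))
```

If nothing has been archived, find_snapshot returns None, which is worth recording too: it tells you the archive cannot help with that page.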
If the website you’re looking for has been completely deleted, a search engine’s cache (its store of pages it has examined in the past) may still hold the last active version. Google offered this for many years but withdrew its cached pages in 2024, so check whether the search engine you’re using still provides a cache and, if not, try an archive site such as the Wayback Machine instead.
SAVING WEB PAGES
The web is not static: information can move about, or even vanish completely. Companies who know they are being watched may deliberately take down material.
Use the ‘save page as’ function in your browser to save a copy of a webpage onto your hard drive, or print it off or take a screenshot (usually done by pressing the Prt Sc button on your keyboard). That way you ensure you have a record of it and you can access it at a later date.
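If you are saving a lot of pages, a short script can keep the copies in order. This is a rough sketch rather than a recommended tool: the function name save_copy and the file naming scheme are invented for illustration. It assumes you already have the page’s contents as bytes (for example from your browser’s saved file or from Python’s urllib.request.urlopen). It stores the copy under a dated file name, along with a small note recording the address, the time and a SHA-256 fingerprint of the contents.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def save_copy(url, content, folder, when=None):
    """Write page bytes to a dated file and return (path, fingerprint)."""
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y%m%dT%H%M%SZ")

    # Replace anything that is not a letter or digit so the address
    # is safe to use as a file name, and keep it to a sensible length.
    name = "".join(c if c.isalnum() else "_" for c in url)[:80]

    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    path = folder / f"{stamp}_{name}.html"
    path.write_bytes(content)

    # The fingerprint changes if even one byte of the saved copy changes.
    fingerprint = hashlib.sha256(content).hexdigest()
    note = folder / f"{stamp}_{name}.txt"
    note.write_text(f"{url}\n{stamp}\n{fingerprint}\n", encoding="utf-8")
    return path, fingerprint


if __name__ == "__main__":
    # Placeholder address and contents, for illustration only.
    saved, digest = save_copy("https://example.com/page", b"<html>hi</html>", "saved_pages")
    print(saved, digest)
```

A screenshot or printout is still worth keeping alongside the file, since it shows how the page actually looked on the day.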
CASE STUDY: Brighton-based arms company EDO MBM removed several pages from its website relating to the manufacture of a controversial bomb rack and arming unit shortly before its Managing Director was due to give evidence in court at the trial of campaigners arrested for a protest against the company. However, by using the Wayback Machine web archive, campaigners were able to recover the pages, making the director’s questioning even more uncomfortable than he had anticipated.