The Global Impact Study inventory was created to provide a baseline for the study’s investigation of the impacts of public access information and communication technologies (ICTs). The inventory has four main objectives:
- Help quantify the magnitude of the public access ICT phenomenon
- Serve as a frame from which to draw survey samples
- Facilitate rich analysis by making it possible to differentiate findings by type of establishment
- Serve as a major research output in its own right
The inventory is intended to capture data for venues that are currently operating, not those planned for the future or those that have already closed. A general requirement is that data be collected from existing administrative data sources (e.g., business registries) with a high degree of confidence.
Inventory data is made available to the public through the inventory database web application.
Inventory design and development
Designing the inventory, developing the data collection tools, and building the database involved nine overlapping stages:
- Identification of relevant data and inventory components to collect. This stage involved looking forward to future project research activities and identifying what possible data these activities could benefit from, with the understanding that only high-level data from administrative sources would be collected.
- Formulation of taxonomy categories and definitions. This stage involved using the knowledge and research experience of project team members to assemble a list of categories that could be used to describe different types of venues.
- Preparation of data collection instruments and guidelines. This stage included the creation of instruction memos for data collection, the inventory database template, and the data dictionary.
- Feedback from local research teams. Spread out over approximately five months, this stage involved using a wiki and in-person discussions to gather feedback from the data collection teams on the feasibility and usefulness of the proposed inventory and taxonomy in each of their countries. They provided feedback on how the inventory could be most effective in collecting comprehensive data that capture country-specific context, while remaining comparable across countries. The data collection teams were also able to propose additional data that could be collected to better understand public access ICT venues within their country-specific contexts.
- Revision of inventory components, instrument, and guidelines. There were several iterations of the inventory and taxonomy that took into account feedback and suggestions from the data collection teams and other researchers on the project team.
- Development of an online database to store inventory data. To ensure data was collected and submitted in a consistent format, this stage began shortly after the inventory development process was underway. The database will be continually updated and adjusted to adapt the tool to changes in the project and new input on ways to access and visualize the data.
- Testing of data collection instruments. This stage lasted for approximately three months, in which data collection teams in each country collected and submitted a preliminary round of inventory data, along with feedback about the implementation process.
- Finalization of inventory data collection instrument. This stage incorporated lessons from the inventory testing stage to create the final inventory and data collection tool. The data collection tool will be accessible on this site in the near future.
- Data collection. Data collection teams in each country built on their preliminary inventories to submit more complete and verified versions.
The inventory contains a total of 64 discrete fields representing three major categories:
- Taxonomy fields
- Location and contact fields
- Comment and supplementary fields
A primary component of the inventory is a taxonomy that categorizes public access venues in a consistent way across countries. The taxonomy is composed of two distinct parts: a global taxonomy and a local taxonomy. The parameters for establishing both taxonomies were that the data to be collected should provide a high-level description of the venue and be solely obtainable from existing administrative data for all facilities with a high degree of confidence.
The global taxonomy is composed of a set of five fields that cannot be adjusted. These five categories were chosen after discussion among project members. Local data collection teams provided suggestions for categories they felt were important for describing public access ICT venues in their countries. To maintain the goal of collecting data uniformly across countries, the local teams’ suggestions were weighed against the ability of researchers in other countries to collect data for those categories. The resulting global taxonomy fields are:
2. Business mode
3. Internet access fee
3.3 Not applicable (for venues with no internet access)
4. Venue type
4.3 Stand-alone facility (i.e., telecenters and cybercafés)
4.4 Other public access location
4.4.1 Government building
4.4.2 Post office
4.4.3 Religious institution
Notes on global taxonomy categories
Ownership: This category relates to the legal description of the venue and not its source of funding. Non-private venues are categorized as “public” rather than “government” since the parameters of what constitutes the government sector are not the same across countries. In many countries governments have established agencies that are technically independent but are nonetheless public entities. For example, an NGO might receive all of its funding from a government, but would still be categorized as an NGO. Similarly, a government program might receive donated computers and connectivity from the private sector, but it is still a publicly owned facility.
Internet Access Fee: “Internet Access Fee” was selected as the most appropriate taxonomy category to capture data on venue service charges. Other options such as “ICT usage fee” were considered, but it was concluded that they would not yield useful or usable data because of the range of pricing structures for different public access ICT services. A category such as “hybrid fee structure” would not be useful because 1) it would require a further breakdown of the category to understand which services are free and which are paid, and 2) reaching this level of granularity may not be possible under the taxonomy requirement that data be acquired only through administrative data sources.
Venue Type: The terms “telecenter” and “cybercafé” were deliberately omitted as sub-categories of venue type due to the existence of varying definitions of these types of venues. Instead, a generic designation of “stand-alone facility” was used to identify such venues. The Ownership and Business Mode taxonomy categories provide additional detail that enables one to distinguish between telecenter-type (public or NGO owned, not-for-profit) and cybercafé-type (private, for profit) venues. Although this is not a perfect solution, it allows for greater control in data collection without assigning a rigorous definition that may be inappropriate and/or difficult to implement in some countries. In instances where a venue is located within an entity that also has the potential to provide ICT access (for example, a cybercafé located within a library), the taxonomy uses the broadest description of the location. Thus for a cybercafé located within a library, the venue would be categorized as a library. The local taxonomy as well as the comments fields can be used to provide context for such instances.
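As an illustration of how the Ownership and Business Mode categories can separate the two kinds of stand-alone facility, here is a minimal Python sketch. The function name and the string values ("public", "ngo", "private", "not-for-profit", "for-profit") are assumptions made for this example and are not the inventory's actual field codes.

```python
def standalone_subtype(ownership: str, business_mode: str) -> str:
    """Label a stand-alone facility as telecenter-type or cybercafe-type.

    The string values are hypothetical; the inventory's real codes may differ.
    """
    ownership = ownership.strip().lower()
    business_mode = business_mode.strip().lower()

    # Telecenter-type: public or NGO owned, not-for-profit.
    if ownership in {"public", "ngo"} and business_mode == "not-for-profit":
        return "telecenter-type"
    # Cybercafe-type: private, for-profit.
    if ownership == "private" and business_mode == "for-profit":
        return "cybercafe-type"
    # Any other combination is left unlabeled rather than forced into a type.
    return "unclassified"
```

Combinations that match neither pattern are returned as "unclassified", which reflects the point above that this distinction is not a perfect solution.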
Mobility: The incorporation of a mobility category accounts for venues that are mobile in nature, such as computer services delivered via boat or bus. The “mobile” category only applies to venues that are mobile in all their operations, not fixed venues that have a mobile component. For example, a fixed location library that sends buses to provide library service to surrounding communities would be classified as a stationary venue. The local taxonomy and comments fields can be used to provide context in such instances.
Urban/rural classification: Since the definition of a rural or urban area varies from country to country and source to source, the designation of a venue as rural or urban is based on research teams’ knowledge of local definitions. With the geo-coding of the inventory data, different definitions of rural and urban can be applied to the data in the future, e.g. based on population size or distance from a central location.
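To illustrate how geo-coded data could support an alternative definition, the following Python sketch classifies a venue by its great-circle distance from a central location. The function names, the 25 km threshold, and the (latitude, longitude) tuple layout are assumptions chosen for this example. They are not definitions used by the study.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def classify_by_distance(venue, center, threshold_km=25.0):
    """Return "urban" if the venue lies within threshold_km of the center.

    venue and center are (latitude, longitude) tuples; the threshold is a
    hypothetical value, not one defined by the study.
    """
    distance = haversine_km(venue[0], venue[1], center[0], center[1])
    return "urban" if distance <= threshold_km else "rural"
```

A population-based definition could be applied in the same way by replacing the distance test with a lookup against population figures for the venue's locality.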
Unlike the global taxonomy, the local taxonomy is a flexible category and allows researchers to include data that they consider vital for high-level understanding of venues but that is unique to particular countries. These types of data will not be universally applicable and are thus not appropriate for the global taxonomy. As with the global taxonomy, it should be possible for local taxonomy data to be collected from existing sources with a high level of confidence and the data should be available for all venues in a particular country.
Location and contact fields
Geographic location and contact information was collected to 1) help pinpoint the venue location as accurately as possible for mapping and 2) document the information needed to contact the venue for research purposes. The data collected include:
- Venue name (in the local language and translated into English).
- Venue start date, as well as venue close or future start dates. Data on venues that closed before, or were expected to open after, the inventory data was submitted were not collected with the same rigor as data for currently operating venues. However, where such data were available, data collection teams were encouraged to submit them.
- Venue address information broken down by street name, building number, city, county, postal code, and any applicable regional units.
- Venue contact address if different from the physical address (for example, instances where the contact address is that of the program under which the venue is run).
- Direct contact information of the venue including phone, fax, email, VoIP, and website (these fields are considered private and will not be publicly available).
- Venue contact person’s role, address, phone number, and other contact information (these fields are considered private and will not be publicly available).
Comment and supplementary fields
Additional data fields in the inventory include the following: confirmation of the presence of ICTs at the venue, other venue information including programs to which the venue may belong, source of data and last data verification date, and comments/notes.
The inventory data stored in this database was collected by local research teams in Bangladesh, Brazil, Chile, Ghana, Lithuania, and the Philippines, under the direction of TASCHA. Data collection teams were provided with an Excel spreadsheet template and instructions for data input, which they submitted to TASCHA. The submitted data was then transferred onto the database platform. Download the Excel spreadsheet template and the instructions for data input.
Since an underlying requirement of the inventory was that all data be collected from administrative sources without additional primary research or fieldwork, it was impractical to expect all data fields to be populated for each venue. Data collection teams were strongly encouraged to populate as many fields as possible, but certain fields were deemed important enough to be required. The 12 required fields were:
- Venue name
- Venue name in English
- Venue street address
- Venue city
- Venue country
- Presence of ICTs at the venue
- Source of information
- Venue ownership
- Business mode
- Internet access fee
- Venue type
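A required-field check of this kind can be sketched in a few lines of Python. The column keys below are hypothetical names derived from the fields listed above. They are not the column names of the actual Excel template or database, and the check itself is an assumption about how such validation might be done.

```python
# Hypothetical keys for the required fields listed above.
REQUIRED_FIELDS = [
    "venue_name",
    "venue_name_english",
    "venue_street_address",
    "venue_city",
    "venue_country",
    "ict_presence",
    "source_of_information",
    "venue_ownership",
    "business_mode",
    "internet_access_fee",
    "venue_type",
]


def missing_required(record: dict) -> list:
    """Return the required fields that are absent or blank in a venue record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat None and whitespace-only strings as unpopulated.
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing
```

A record for which `missing_required` returns an empty list has every listed required field populated. Optional fields are ignored by the check.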
To register for access and use the application, visit the site at http://database.globalimpactstudy.org/.
Updating the database
The dynamic nature of public access ICT venues means that they are notoriously hard to track, as data quickly becomes outdated. To maintain a dataset that is as up to date as possible, and to take advantage of the opportunity to monitor venue birth and death cycles, data collection teams updated their inventories annually for the duration of the project. The final update was submitted at the end of 2011.
Inventory database source code
The source code for the inventory database is available for other developers to use and adapt.
The web application is written in PHP, using the Symfony framework, with a MySQL database back-end. Accompanying the source code is basic documentation describing the various components of the system. The extent of documentation is intended for individuals with experience deploying these types of environments. Please note that development of the Global Impact Study inventory database has ended. Although most bugs have been addressed, there is the possibility that some remain. As such, please treat the code as beta. This is open-source software licensed under the terms of the GNU General Public License v. 3.
The data in this inventory is limited by the following factors:
- Dependence on administrative data sources
- Limited data on cybercafés due to low official registration levels
- Limited data disaggregating libraries with public access computers from those without public access computers
- High turnover of public access ICT venues, especially cybercafés
- Project definition of public access ICTs
Based on their local knowledge, data collection teams provided estimations of their level of confidence in the extent to which the data they submitted represents the totality of public access ICT venues in their country. Estimates range from 50-100% for telecenters, 3-40% for cybercafés, and 5-99% for libraries with public access computers. Any analysis of the inventory data, particularly if it involves assessments of the relative quantities of different venue types, should take this into consideration.
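One way to take these confidence estimates into account is to convert a recorded venue count into a plausible range for the true total. The Python sketch below does this by dividing the recorded count by the coverage bounds. The function name and the count of 120 venues in the usage note are hypothetical, and the approach is an illustration rather than a method used by the study.

```python
def estimated_total_range(recorded: int, coverage_low: float, coverage_high: float):
    """Return (low, high) estimates of the true number of venues.

    coverage_low and coverage_high are fractions between 0 and 1 giving the
    share of all venues that the recorded data are believed to represent.
    """
    if not (0 < coverage_low <= coverage_high <= 1):
        raise ValueError("coverage bounds must satisfy 0 < low <= high <= 1")
    # Higher coverage implies fewer unrecorded venues, so it gives the lower bound.
    return recorded / coverage_high, recorded / coverage_low
```

For example, if a hypothetical inventory recorded 120 cybercafés and coverage were anywhere in the 3-40% range reported above, `estimated_total_range(120, 0.03, 0.40)` would suggest a true total of roughly 300 to 4,000 venues. The width of that range shows why comparisons of relative quantities across venue types need caution.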