
BibSonomy ::  user :: pitman ::

The blue social bookmark and publication sharing system.
 

bookmarks

 (56)
  • The Talis Connected Commons scheme is intended to directly support the publishing and reuse of Linked Data in the public domain by removing the costs associated with those activities. The scheme is intended to support a wide range of different forms of data publishing. For example: scientific researchers seeking to share their research data; dissemination of public domain data from a variety of different charitable, public sector or volunteer organizations; open data enthusiasts compiling data sets to be shared with the web community.
    For qualifying data sets, Talis will provide, through the Talis Platform:
      * Free hosting of up to 50 million RDF triples and 10Gb of content
      * Access to data access services that operate on that data, including data retrieval and text search
      * Free access to a public SPARQL endpoint for each dataset.
    This means that data set providers will not incur any of the commercial costs normally associated with hosting data on the Talis Platform. In addition, neither the data set provider nor its users will incur any usage charges relating to the use of the Platform services made available on that data.
    To qualify for entry into the scheme all data and content hosted in the Platform must be made available under one of the following public domain data licenses:
      * Open Data Commons Public Domain Dedication and License
      * Creative Commons CC0
    to commons data open by pitman and 3 other users on Apr 25, 2009, 11:45 PM
  • Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. For more information about Tika, please see the list of supported document formats and the available documentation. You can find the latest release on the download page. See the Getting Started guide for instructions on how to start using Tika. Tika is a subproject of Apache Lucene. Lucene is a project of the Apache Software Foundation.
    to data extraction parsing structured by pitman on Apr 21, 2009, 12:18 AM
  • Medvane is an automated bibliome mining system. Medvane's data source includes articles published since 1973 with at least one author from Harvard or its affiliated institutions. Articles from PubMed are analyzed in the contexts of journal, author, subject, and gene. The relationship between these aspects and their evolution over time give a bird's eye view of biomedical research.
    to mining biblio Harvard data by pitman on Apr 16, 2009, 6:33 AM
  • The Web is increasingly understood as a global information space consisting not just of linked documents, but also of linked data. More than just a vision, the Web of Data has been brought into being by the maturing of the Semantic Web technology stack, and by the publication of an increasing number of datasets according to the principles of Linked Data. Today, this emerging Web of Data includes data sets as extensive and diverse as DBpedia, Geonames, US Census, EuroStat, MusicBrainz, BBC Programmes, Flickr, DBLP, PubMed, UniProt, FOAF, SIOC, OpenCyc, UMBEL and Yago. The availability of these and many other data sets has paved the way for an increasing number of applications that build on Linked Data, support services designed to reduce the complexity of integrating heterogeneous data from distributed sources, as well as new business opportunities for start-up companies in this space.
    to 2009 workshop April data web conference linked by pitman on Apr 1, 2009, 4:38 PM
  • STW Thesaurus for Economics is now available under http://zbw.eu/stw. STW is a richly interconnected vocabulary in English and German on economics and business economics as well as some related subject areas. It includes subject categories and lots of synonyms in order to find the appropriate terms. Its publication aims at providing an interlinking hub for economics resources on the web of Linked Data. The thesaurus is maintained by the German National Library of Economics (ZBW) and published under a Creative Commons (by-nc-sa) license. It is delivered as XHTML+RDFa pages with an incremental search interface and a navigable tree. A SKOS RDF/XML dump version can be downloaded, as well as a set of links to DBpedia concepts. More information about the design of the application can be found in a paper for the "Linked Data on the Web" workshop in Madrid (http://events.linkeddata.org/ldow2009/papers/ldow2009_paper7.pdf).
    to thesaurus data economics linked by pitman and 1 other user on Apr 1, 2009, 4:36 PM
  • Clickstream Data Yields High-Resolution Maps of Science
    Johan Bollen [1,*], Herbert Van de Sompel [1], Aric Hagberg [2,#], Luis Bettencourt [2,3,#], Ryan Chute [1,#], Marko A. Rodriguez [2], Lyudmila Balakireva [1]
    [1] Digital Library Research and Prototyping Team, Research Library, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
    [2] Theoretical Division, Mathematical Modeling and Analysis Group, and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
    [3] Santa Fe Institute, Santa Fe, New Mexico, United States of America
    Abstract. Background: Intricate maps of science have been created from citation data to visualize the structure of scientific activity. However, most scientific publications are now accessed online. Scholarly web portals record detailed log data at a scale that exceeds the number of all existing citations combined. Such log data is recorded immediately upon publication and keeps track of the sequences of user requests (clickstreams) that are issued by a variety of users across many different domains. Given these advantages of log datasets over citation data, we investigate whether they can produce high-resolution, more current maps of science.
    to data maps science by pitman on Mar 18, 2009, 1:33 AM
  • to bibliographic code data by pitman and 2 other users on Feb 26, 2009, 2:43 AM
  • The second worldwide review of the Functional Requirements for Authority Data was completed in mid-2007. The Working Group is completing work on a new draft which incorporates comments received during the review. When complete, the draft will be submitted to Division IV for approval. In the meantime, the draft used in the second worldwide review remains available. Readers of the document should be aware that there will be many changes in the final version.
    to authority data functional library requirements by pitman on Feb 4, 2009, 7:22 PM
  • to data grants by pitman and 1 other user on Jan 30, 2009, 2:54 AM
  • Templates for use in RePEc. Here are some templates for use by archive maintainers. Cut and paste them, or save them, and then fill them out. Delete unnecessary lines, repeat clusters (like author, file) as a whole. Additional attributes are available; please see the ReDIF documentation for complete details. Attributes that go over one line should start with an empty space in all subsequent lines. Series templates all need to be in the same file (xxxseri.rdf). It does not matter whether individual templates are in separate files or not, as long as they are all in the same directory (one for each series, directory name with 6 characters).
    to RePEc data format templates by pitman on Jan 27, 2009, 3:23 AM
  • to bibliography data fields metadata works by pitman on Jan 9, 2009, 1:03 AM
  • R first appeared in 1996, when the statistics professors Robert Gentleman, left, and Ross Ihaka released the code as a free software package. By Ashlee Vance. Published: January 6, 2009
    to 2009 R analysis data news open software source statistics by pitman and 1 other user on Jan 8, 2009, 5:34 AM
  • Q: How is the indexing performed?
    A: Indexing is the process of creating a Conceptual Fingerprint from a text. In Collexis, this automated indexing mechanism performs the following steps on the text: removing the stop words, normalizing the text, selecting concepts by comparison with the thesaurus, clustering the concepts and attaching a relative weight to the concepts by means of a set of algorithms and measuring the specificity, similarity and frequency of the concepts.
    Q: How does Collexis generate its search results?
    A: Collexis employs vector matching: comparing a search query with the Fingerprints from the records in a Collexion. The outcome is a very accurate and relevant list of content items and/or experts in the form of a list of records. There also exists the possibility of over-specifying a query (i.e., using a considerable piece of text), thus adding context to the query. This context will help the system to improve the accuracy of the query and return references to those content items that are contextually related. The system administrator can enlarge or reduce the set of returned documents by entering a threshold that indicates the minimum “distance” between the records returned and the query. Matching of a search query with Collexion records can be performed on multiple Collexions at the same time.
    Q: What makes Collexis different?
    A: Initially, Collexis differentiates itself from full-text search engines by making use of thesauri for information retrieval. The high-quality search is based on semantics that have been defined in a thesaurus or ontology: synonymous terms and terms in different languages are linked to a single concept. Hierarchical relations between concepts, links between definitions and terms, and other semantic relationships are utilized in the search applications. This process helps to highlight those terms most relevant to the searcher’s query.
    to data indexing search text by pitman on Jan 4, 2009, 6:04 AM
  • At the core of the platform lies a common data model to enable interoperability between the different components. In order to interact with the platform, a component should know how to interact with at least a subset of the CDM. The model describes all the commonly used data that is dealt with in the platform, and therefore covers at least taxonomic names and concepts; literature references; authors; (type) specimen; structured descriptive data; and species related content of any kind like economic use or conservation status. Nearly all this data has already been described by existing or upcoming TDWG standards. Unfortunately, there are still major gaps in compatibility, so a new integrated data model has to be developed in order to quickly yield results.
    to data interoperability model by pitman on Dec 22, 2008, 9:26 PM
  • Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. An initial list of data sets is already available, and more will be added soon.
    to Amazon data public services web by pitman and 3 other users on Dec 16, 2008, 6:54 PM
  • This table contains DML bibliographic items from various repositories.
    #
    # Coding is as follows:
    # ASCII based (ISO Latin 8859-1 extended)
    # Every line starting with a '#' is a comment
    #
    # the list of items from any repository is preceded by lines like the following:
    #
    # nick: <repository nickname, usually short or acronym>
    # name: <repository name>
    # addr: <repository web address>
    # comm: <any comment concerning the actual repository
    #
    # After that, the bibliographic items of that repository are described by:
    #
    # item_title: <name or title of item>
    # item_years: <year(s) published or covered>
    # item_url: <web address of content page>
    # item_type: <journal|multivol|book>
    # (possibly other colon separated pairs, first component should begin with "item_")
    # item_end: <optionally some comment like a counting number...>
    # This last line ends any item entry.
    #
    # Some items do contain commented metadata for later use.
    #
    # comment lines like #--------------------------- or similar
    # could separate entries from different repositories
    to commercial data library list math web by pitman on Dec 16, 2008, 6:22 AM
  • A Creative Commons license is inappropriate for cataloging records, precisely because they are unlikely to be copyrightable. The whole legal premise of Creative Commons (and open source) licenses is that someone owns the copyright, and thus they have the right to license you to use it, and if you want a license, these are the terms. If you don’t own a copyright in the first place, there’s no way to license it under Creative Commons.
    to commons data library license by pitman and 2 other users on Dec 15, 2008, 6:44 PM
  • to access data information open project by pitman on Dec 6, 2008, 8:10 PM
  • GDataCopier provides a command line tool called 'gdoc-cp' that allows system administrators to automate bi-directional copy of documents & spreadsheets between local machines and Google document servers. It also presents Python programmers with an API to incorporate document/spreadsheet download and import features into their applications. GDataCopier requires the Google Data API to function.
    to Google copier data by pitman on Nov 30, 2008, 7:14 PM
  • to Wikipedia data linked semantic_web by pitman and 7 other users on Nov 23, 2008, 7:21 PM
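
The DML bibliographic table entry in the list above describes its file format closely enough to sketch a reader for it. What follows is a minimal sketch in Python, not tooling from the repository itself. The function name `parse_dml_table` and the sample data are hypothetical. It assumes that a `nick:` line opens each repository block and that only the first colon on a line separates the key from the value, so that web addresses survive intact.

```python
from typing import Any, Dict, List


def parse_dml_table(text: str) -> List[Dict[str, Any]]:
    """Parse the table text into a list of repositories, each with its items."""
    repositories: List[Dict[str, Any]] = []
    repo = None  # repository block currently being filled
    item = None  # item entry currently being filled

    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a comment as defined by the format
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue  # not a key/value pair; ignored in this sketch
        key, value = key.strip(), value.strip()

        if key == "nick":
            # Assumption: a "nick" line starts a new repository block.
            repo = {"nick": value, "items": []}
            repositories.append(repo)
            item = None
        elif key.startswith("item_"):
            if repo is None:
                continue  # item outside any repository; skipped here
            if item is None:
                item = {}
            item[key] = value
            if key == "item_end":
                # "item_end" closes the current item entry.
                repo["items"].append(item)
                item = None
        elif repo is not None:
            repo[key] = value  # name, addr, comm

    return repositories


# Hypothetical sample in the described layout (not real repository data).
sample = """\
# hypothetical sample data
nick: demo
name: Demo Repository
addr: http://example.org/
item_title: Example Journal
item_years: 1900-1950
item_url: http://example.org/journal
item_type: journal
item_end: 1
"""

repos = parse_dml_table(sample)
```

The format is stated to be ISO Latin 8859-1 based, so a file on disk would presumably be opened with `encoding="latin-1"` before its contents are passed to the function.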

publications

 (2)