RussianPatents.com

Method and device for subscribing to information from web page

IPC classes for russian patent Method and device for subscribing to information from web page (RU 2510921):

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to processing devices. The device includes an identification module for identifying the web page block to which a user subscribes, using a first Document Object Model (DOM) tree of the web page, to obtain identification information; a real-time monitoring module for extracting and storing the URL addresses of all links in the web page block to which the user subscribes, and for monitoring the URL addresses in the block according to the identification information and the stored URL addresses to determine whether any URL address has changed; and a display module for displaying the web page corresponding to the changed URL address if there is any change in the URL addresses of the web page block to which the user subscribes.

EFFECT: enabling subscription to any block of web page content and reducing the amount of required service resources provided by a provider.

23 cl, 3 tbl, 8 dwg

 

FIELD OF THE INVENTION

[0001] The present invention relates to the field of information processing on the Internet, and in particular to a method and device for subscribing to information from a web page.

BACKGROUND OF THE INVENTION

[0002] With the development of the Internet, most users have come to receive news from the Internet. With the conventional way of obtaining information, the user has to open web pages one after another to find the necessary information. To simplify this, the user can subscribe to information from a web site. When viewing a web page, the user may be interested in only part of its content. Web slices, provided in IE 8.0, allow the user to subscribe to part of the content of a web page.

[0003] To enable subscription with web slices, special identifiers are added to the HTML code of the web page to identify a block of content on the page. These web slice identifiers make it possible to subscribe to the corresponding block of the web page.

[0004] The inventor of the present invention has found the following deficiencies in web slices.

[0005] First, web slices allow subscription only to content marked with the special identifiers, and not to an arbitrary block of the web page.

[0006] Second, because the identifiers have to be inserted into the HTML code of the web page, the content provider of the web site has to allocate additional service resources.

BRIEF DESCRIPTION OF THE INVENTION

[0007] Implementations of the present invention provide a method and device for subscribing to information from a web page, making it possible to subscribe to any content block on a web page and to reduce the amount of service resources provided by the content provider, or to release the content provider from having to provide service resources associated with subscription.

[0008] In accordance with one implementation of the present invention, a method for subscribing to information from a web page is proposed. The method includes the following steps:

identifying the block of the web page to which the user subscribes, using a first Document Object Model (DOM) tree of the web page, to obtain identification information;

extracting and storing the URLs of all links in the block of the web page to which the user subscribes, and monitoring the URLs in the block in real time according to the identification information and the stored URLs to determine whether there is any change in the URLs;

displaying the web page corresponding to the changed URL if there is any change in the URLs of the block of the web page to which the user subscribes.

[0009] In accordance with another implementation of the present invention, a device for subscribing to information from a web page is proposed. The device contains the following modules:

an identification module for identifying the block of the web page to which the user subscribes, using a first Document Object Model (DOM) tree of the web page, to obtain identification information;

a real-time monitoring module for extracting and storing the URLs of all links in the block of the web page to which the user subscribes, and for monitoring the URLs in the block according to the identification information and the stored URLs to determine whether there is any change in the URLs;

a display module for displaying the web page corresponding to the changed URL if there is any change in the URLs of the block of the web page to which the user subscribes.

[0010] In implementations of the present invention, the block of the web page to which the user subscribes is identified using the DOM tree of the page to obtain identification information. The URLs in this block are extracted and stored. The URLs in the block to which the user subscribes are monitored in real time according to the identification information and the stored URLs to determine whether there is any change in the URLs. The web page corresponding to the changed URL is displayed. Because any block of content in the web page can be identified automatically, the content provider does not need to mark the content of the web page in advance. It is thus possible to subscribe to any content block on the web page, and the amount of service resources provided by the content provider decreases. In addition, the block of the web page to which the user subscribes may be identified and shown on the web page by highlighting it with a special background color. As a result, the user experience is improved.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] Figure 1 shows a flow chart of a method for subscribing to information from a web page according to the first implementation of the present invention.

[0012] Figure 2 shows a flow chart of a method for subscribing to information from a web page according to the second implementation of the present invention.

[0013] Figure 3 shows a schematic representation of a web page block according to the second implementation of the present invention.

[0014] Figure 4 shows a schematic representation of the first DOM tree according to the second implementation of the present invention.

[0015] Figure 5 shows a schematic representation of the second DOM tree according to the second implementation of the present invention.

[0016] Figure 6 shows a flow chart of a method for subscribing to information from a web page according to the third implementation of the present invention.

[0017] Figure 7 shows a block diagram of a first device for subscribing to information from a web page according to the fourth implementation of the present invention.

[0018] Figure 8 shows a block diagram of a second device for subscribing to information from a web page according to the fourth implementation of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

[0019] The present invention is described in detail below with reference to the accompanying drawings and implementations, so that its technical solution and advantages are better understood.

[0020] First implementation.

[0021] This implementation of the present invention provides a method for subscribing to information from a web page. As shown in figure 1, the method includes the following steps.

[0022] In step 101, when the user subscribes to information on a web page of a web site, the block of the web page to which the user subscribes is identified using the DOM tree of the page to obtain identification information.

[0023] In step 102, the URLs of all links in the block of the web page to which the user subscribes are extracted and stored. The URLs in this block are monitored in real time according to the identification information and the stored URLs. If there is any change in the URLs of the block, step 103 is performed.

[0024] In step 103, the web page corresponding to the changed URL is displayed.

[0025] In this step, when the web page corresponding to the changed URL is displayed, the stored URLs are updated according to the changed URL, i.e. the previously stored URLs are replaced with the new URLs of all links in the block of the web page to which the user subscribes. The user may be shown the text information of the block to which they subscribed, with unnecessary information, such as advertisements, banners, navigation information and copyright notices, excluded from the text information. In addition, before the text information of the block is displayed, the corresponding web pages may be downloaded according to the URL list in order to analyze which content the user is interested in. The content of interest is then processed, and the text information of the block is displayed to the user.
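The change check of steps 102 and 103 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names `diff_urls` and `check_block` are hypothetical, and it assumes that a "changed URL" means a URL that is present in the block now but absent from the stored list.

```python
def diff_urls(saved: list[str], current: list[str]) -> list[str]:
    """Return URLs found in the block now but not in the stored list (page order kept)."""
    saved_set = set(saved)
    return [url for url in current if url not in saved_set]


def check_block(saved: list[str], current: list[str]) -> tuple[list[str], list[str]]:
    """Return (changed URLs to display, new stored list).

    Assumption: as in paragraph [0025], the stored URLs are replaced
    wholesale by the URLs currently present in the block.
    """
    changed = diff_urls(saved, current)
    return changed, list(current)


changed, stored = check_block(["S1", "S2", "S3"], ["S2", "S3", "S4"])
print(changed)  # URLs whose pages would be shown to the user
print(stored)   # list kept for the next check
```

Keeping the comparison separate from the update makes it easy to display the changed pages before the stored list is overwritten.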

[0026] Because any block of the web page can be identified automatically, the content provider does not need to mark the content of the web page in advance. This makes it possible to subscribe to any content block on the web page and reduces the amount of service resources provided by the content provider.

[0027] Second implementation.

[0028] This implementation of the present invention also provides a method for subscribing to information from a web page. As shown in figure 2, the method includes the following steps.

[0029] In step 201, the user ID and the URL of the web page are received.

[0030] The user wants to subscribe to information from a web page. The web page includes at least one block, and each block contains at least one basic unit block. Each block of the web page has a title and a title URL. Each block also contains several links, each of which represents content of the web page.

[0031] For example, figure 3 shows a web page block titled "automobile", taken from the homepage of a website. The title of the block is "automobile", and the block has a corresponding title URL. The block contains basic unit block 1, basic unit block 2 and thirteen links. The links represent content of the homepage. In this implementation, the web page block is used as the basic unit of subscription to a web page.

[0032] In the code of the web page, a block is a Div node, and several Div nodes are nested in this Div node. A basic unit block is also a Div node. The Div node corresponding to a basic unit block is nested in the Div node corresponding to the web page block and has no Div sub-nodes. The number of characters in a basic unit block exceeds a predetermined threshold, which is typically set to 20.
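The definition of a basic unit block in paragraph [0032] can be illustrated with a short sketch built on Python's standard `html.parser`. The class and function names are hypothetical, and counting the stripped text of the Div as "the number of characters" is an assumption; the source does not say exactly how characters are counted.

```python
from html.parser import HTMLParser


class BasicUnitBlockFinder(HTMLParser):
    """Collect the text of every <div> that has no nested <div> and enough text."""

    def __init__(self, threshold: int = 20):
        super().__init__()
        self.threshold = threshold
        self._stack = []   # one entry per currently open <div>
        self.blocks = []   # text of each basic unit block, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self._stack:
                # The enclosing div has a div sub-node, so it is not a basic unit block.
                self._stack[-1]["has_div_child"] = True
            self._stack.append({"has_div_child": False, "text": []})

    def handle_data(self, data):
        # Text inside <a>, <span>, etc. is credited to the innermost open div.
        if self._stack and data.strip():
            self._stack[-1]["text"].append(data.strip())

    def handle_endtag(self, tag):
        if tag == "div" and self._stack:
            node = self._stack.pop()
            text = " ".join(node["text"])
            if not node["has_div_child"] and len(text) > self.threshold:
                self.blocks.append(text)


def find_basic_unit_blocks(html: str, threshold: int = 20) -> list[str]:
    finder = BasicUnitBlockFinder(threshold)
    finder.feed(html)
    finder.close()
    return finder.blocks


page = (
    "<div>"
    "<div><a href='/a'>Luxury cars reach second-tier cities</a></div>"
    "<div>short</div>"
    "</div>"
)
print(find_basic_unit_blocks(page))
```

The outer Div is skipped because it contains Div sub-nodes, and the second inner Div is skipped because its text does not exceed the threshold of 20 characters.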

[0033] In step 202, the corresponding web page is downloaded from the web site by its URL.

[0034] To load a web page, its code has to be downloaded. The code may be HTML or XML. The downloaded code is saved in a text file. After the code of the web page is downloaded, the absolute paths in it are changed to relative ones. At the same time, the web page is supplemented with relative path information for Cascading Style Sheets (CSS) and IMG elements. The web page can thus be displayed to the user normally (this corresponds to the prior art and is not described further in this implementation).

[0035] In step 203, the DOM tree corresponding to the web page is created from the code of the page using an existing document parsing method.

[0036] The code saved in the text file is scanned using the document parsing method to create the DOM tree corresponding to the web page. In the document parsing method, a block of the web page is used as a node of the DOM tree; the title and title URL of the block are used as a sub-node of the node corresponding to the block; and each basic unit block of the block is used as a sub-node of the node corresponding to the block. For simplicity, the node that stores the title and title URL of a web page block in the DOM tree is called the header node.

[0037] In step 204, the block of the web page to which the user subscribes is received.

[0038] When the web page is displayed to the user, the user can choose the information to subscribe to. Because in this implementation the basic unit of subscription is a web page block, the block is located according to the position on the web page of the information to which the user subscribes, and all of its basic unit blocks are identified. The user can subscribe to one or more blocks of the web page. This implementation takes as an example the case where the user subscribes to one block. For example, the user wants to subscribe to the information in the web page block shown in figure 3 (from the homepage of the website). The block is located according to the position of the information to which the user subscribes, and basic unit blocks 1 and 2 of this block are identified. The user ID is ID1, and the URL of the web page is the URL of the homepage of the website.

[0039] In this implementation, subscription may also be performed by recommendation. Specifically, each time the user subscribes, the title of the block to which the user subscribes is recorded. When the web page is displayed to the user, the corresponding block is selected automatically according to the recorded title, and the user is prompted to confirm the selected block. If the user decides to subscribe to the selected block, step 205 is performed. Otherwise, the user repeats the operation of subscribing to the required information. For example, assume that the user has previously subscribed to the web page block "automobile". The title "automobile" of this block is recorded. When the user again subscribes to information from the homepage of the website, the block "automobile" is selected automatically on this page and the user is prompted to confirm it. If the user decides to subscribe to the block "automobile", step 205 is performed; otherwise, the user repeats the operation of subscribing to information from the homepage.

[0040] In step 205, the identification information of the web page block is obtained by identifying the block. The identification information contains at least the sequence number of the first basic unit block in the web page block, the title and URL stored in the header node of the block, and the number of basic unit blocks in the block.

[0041] This involves the following steps (1)-(4).

[0042] (1) The sequence number of the first basic unit block in the web page block and the number of basic unit blocks in the block are determined.

[0043] The initial value of a variable is set to 0. The DOM tree is traversed using an existing pre-order traversal algorithm. Each time a node corresponding to a basic unit block is passed, 1 is added to the value of the variable, and the resulting value is used as the sequence number of that basic unit block. The traversal of the DOM tree then continues. When the traversal is complete, the sequence numbers of the nodes corresponding to all basic unit blocks have been determined. Note that within one web page block the header node and the nodes corresponding to the basic unit blocks are contiguous. Therefore, in a pre-order traversal the header node is visited first, followed by the nodes corresponding to the basic unit blocks.

[0044] For example, as shown in figure 4, the web page block shown in figure 3 is used as node A. The title and title URL, basic unit block 1 and basic unit block 2 of the web page block are used as the three sub-nodes of node A. The three sub-nodes are node B, node 12 and node 13, and node B is the header node. The initial value of the variable is set to 0, and the DOM tree is traversed using an existing pre-order traversal algorithm. Assume that the value of the variable has increased to 11 by the time the traversal reaches basic unit blocks 1 and 2. On passing node 12, which corresponds to basic unit block 1, the variable is incremented by 1 and reaches 12. The value 12 is used as the sequence number of node 12. On passing node 13, which corresponds to basic unit block 2, 1 is added to the variable and it becomes 13. The value 13 is used as the sequence number of node 13. The traversal continues until the entire DOM tree has been covered.

[0045] That is, the DOM tree is traversed in pre-order for each basic unit block in the web page block. After the node corresponding to a basic unit block is passed, the number of this node is used as the sequence number of the basic unit block. The basic unit block with the minimum sequence number is taken as the first basic unit block, and that minimum sequence number is used as the sequence number of the first basic unit block in the web page block. The number of basic unit blocks in the web page block is also determined.

[0046] For example, for basic unit blocks 1 and 2 of the web page block shown in figure 3, the DOM tree shown in figure 4 is traversed in pre-order. After node 12, which corresponds to basic unit block 1, is passed, the number 12 is used as the sequence number of this block. After node 13, which corresponds to basic unit block 2, is passed, the number 13 is used as the sequence number of this block. The basic unit block with the minimum sequence number is the first basic unit block in the web page block, so the number 12 is used as the sequence number of the first basic unit block. The number of basic unit blocks in the web page block is 2.
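The numbering in step (1) can be sketched as a pre-order traversal with a counter. The `Node` class and the function name are hypothetical, and the `start` parameter is only an illustrative stand-in for the value the variable has already reached earlier in the page (11 in the example above).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    is_basic_unit: bool = False
    children: list["Node"] = field(default_factory=list)


def number_basic_units(root: Node, start: int = 0) -> dict[str, int]:
    """Pre-order traversal; the counter grows by 1 at each basic unit block node."""
    counter = start
    numbers: dict[str, int] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.is_basic_unit:
            counter += 1
            numbers[node.name] = counter
        # Push children in reverse so that the leftmost child is visited first.
        stack.extend(reversed(node.children))
    return numbers


# Node A is the web page block; B is its header node; the other two are basic unit blocks.
block_a = Node("A", children=[
    Node("B"),
    Node("unit 1", is_basic_unit=True),
    Node("unit 2", is_basic_unit=True),
])
numbers = number_basic_units(block_a, start=11)
first_sequence_number = min(numbers.values())
unit_count = len(numbers)
print(numbers, first_sequence_number, unit_count)
```

With the counter already at 11, the two basic unit blocks receive the numbers 12 and 13, the first sequence number is 12 and the count is 2, which matches the example in the text.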

[0047] (2) The prefixes of the URLs of all links in the web page block are read. The number of URL prefixes of each type is counted. The type of URL prefix with the largest count is selected as the URL prefix of the web page block.

[0048] The URLs of the links in the web page block are classified according to their structure. The URLs in each category begin with a common string, which is the URL prefix of the category.

[0049] The URLs of most or all of the links in a web page block have the following structure: "URL of the web page block + content subdirectory". The URLs of some links in the block may have other structures. In the web page block shown in figure 3, the URLs of most links have this structure. For example, the URL of the link "luxury cars enclose land in second and third tier cities" ends in 1119/000082 .htm. Therefore, for all link URLs with a structure of the type "URL of the web page block + content subdirectory", the URL prefix extracted from each URL is identical or similar to the URL of the web page block. A URL prefix is similar to the URL of the web page block when the URL of the block is a substring of the URL prefix, or the URL prefix is a substring of the URL of the block. For example, the URL prefix of the link "luxury cars enclose land in second and third tier cities" may be identical to the URL of the web page block. As another example, the URL prefix of this link may also be http://auto.qq.com/a. In that case the URL of the web page block is a substring of the URL prefix, i.e. they are similar.

[0050] Because the URLs of most or all of the links in the web page block have a structure of the type "URL of the web page block + content subdirectory", the URL prefixes of most or all of the links are identical or similar to the URL of the web page block. Therefore, the type of prefix with the largest count is taken as the URL prefix of the web page block.
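Step (2) amounts to a majority vote over URL prefixes. The sketch below is only an illustration: the source does not define exactly how a prefix is cut from a URL, so taking scheme plus host as the prefix is an assumption, and the function names and `example.com` URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def url_prefix(url: str) -> str:
    """Assumed prefix rule: scheme and host of the URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def block_url_prefix(urls: list[str]) -> str:
    """Return the prefix type shared by the largest number of link URLs."""
    counts = Counter(url_prefix(url) for url in urls)
    return counts.most_common(1)[0][0]


links = [
    "http://auto.example.com/a/0001.htm",
    "http://auto.example.com/a/0002.htm",
    "http://ads.example.net/banner",
]
print(block_url_prefix(links))
```

A stray advertising link inside the block does not affect the result as long as most links share the block's own prefix.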

[0051] (3) The header node of the web page block is searched for in the DOM tree according to the selected URL prefix.

[0052] Specifically, a forward search is performed in the DOM tree starting from the node corresponding to the first basic unit block of the web page block. When a header node is found, it is determined whether the URL in it is identical or similar to the URL prefix. If so, this header node is the header node of the web page block; otherwise, the search of the DOM tree continues.

[0053] The forward search is performed in the direction opposite to that of the pre-order traversal of the DOM tree.

[0054] For example, assume that the URL prefix of the web page block shown in figure 3 was obtained in step (2). The forward search in the DOM tree starts from the first basic unit block, i.e. node 12, which corresponds to basic unit block 1. When header node B is found, its URL is read. It is determined that this URL is identical to the URL prefix. Therefore, header node B is the header node of the web page block shown in figure 3.
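The forward search of step (3) can be sketched over a list of nodes in pre-order, scanning backwards from the first basic unit block. The dictionary layout, the function names and the `example.com` URLs are hypothetical; the similarity test follows the substring rule described in paragraph [0049].

```python
def is_similar(url: str, prefix: str) -> bool:
    """Identical, or one string is a substring of the other."""
    if not url or not prefix:
        return False
    return url in prefix or prefix in url


def find_block_header(preorder: list[dict], first_unit_index: int, prefix: str):
    """Scan against the pre-order direction for a matching header node."""
    for node in reversed(preorder[:first_unit_index]):
        if node.get("kind") == "header" and is_similar(node.get("url", ""), prefix):
            return node
    return None


preorder = [
    {"kind": "header", "title": "news", "url": "http://news.example.com"},
    {"kind": "unit", "title": "news unit"},
    {"kind": "header", "title": "automobile", "url": "http://auto.example.com"},
    {"kind": "unit", "title": "basic unit block 1"},
    {"kind": "unit", "title": "basic unit block 2"},
]
header = find_block_header(preorder, first_unit_index=3, prefix="http://auto.example.com/a")
print(header["title"], header["url"])
```

Because the header node and the basic unit blocks of one web page block are contiguous in pre-order, the nearest matching header found while scanning backwards is taken as the header of the subscribed block.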

[0055] (4) The title and URL stored in the header node are read to obtain the title and title URL of the web page block.

[0056] For example, the title "automobile" and the title URL are read from header node B.

[0057] Thus, the user ID, the URL of the web page and the identification information of the web page block can be saved as a record that associates these items.

[0058] For example, assume that the user ID is ID1, that the sequence number of the first basic unit block in the web page block is 12, that the header node of the block holds the title "automobile" together with its URL, and that the number of basic unit blocks is 2. Together with the URL of the web page, this information can be stored as a record, as shown in table 1.

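| User ID | URL of the web page | Sequence number of the first basic unit block | Title of the header node | URL of the header node | Number of basic unit blocks |
| --- | --- | --- | --- | --- | --- |
| ID1 | http://www.qq.com | 12 | automobile | http://auto.qq.com | 2 |
| ... | ... | ... | ... | ... | ... |

Table 1 (the last four columns make up the identification information)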
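One possible in-memory shape for such a record is sketched below. The class and field names are hypothetical and not taken from the source; the values are those given in paragraph [0058] and table 1, and keying the store by user ID and page URL is an assumption based on how the record is looked up later.

```python
from dataclasses import dataclass, field


@dataclass
class BlockIdentification:
    first_unit_sequence_number: int
    header_title: str
    header_url: str
    unit_count: int


@dataclass
class SubscriptionRecord:
    user_id: str
    page_url: str
    identification: BlockIdentification
    saved_urls: list[str] = field(default_factory=list)  # filled in step 206


# Records are looked up by user ID and page URL.
records: dict[tuple[str, str], SubscriptionRecord] = {}


def save_record(record: SubscriptionRecord) -> None:
    records[(record.user_id, record.page_url)] = record


save_record(SubscriptionRecord(
    user_id="ID1",
    page_url="http://www.qq.com",
    identification=BlockIdentification(12, "automobile", "http://auto.qq.com", 2),
))
print(records[("ID1", "http://www.qq.com")].identification)
```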

[0059] In step 206, the URLs of all links in the subscribed web page block are read and stored; the URLs can be stored in the previously created record according to the user ID and the URL of the web page.

[0060] In addition, when the URLs are read, a timer can be set for tracking changes of the URLs in the web page block. The timer can be set by the user as required or left at a default value. The timer interval is typically short, for example half an hour or one hour.

[0061] Assume that 13 URLs are read from the web page block shown in figure 3: S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11, S12 and S13. According to the user ID (ID1) and the URL of the web page, the 13 URLs are stored in the record, as shown in table 2. A timer is then set for this record.

User ID | URL of the web page | URLs of the subscribed web page block
ID1 | http://www.qq.com | S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11, S12 and S13
... | ... | ...
Table 2

[0062] In step 207, the URLs in the web page block are tracked according to the obtained identification information and all the stored URLs. If any URL has changed, step 208 is performed.

[0063] The following steps 1 to 4 are performed.

[0064] In step 1, after the timer configured in step 206 expires, the identification information is read from the stored record according to the user ID and the URL of the web page. The identification information contains at least the sequence number of the first basic unit block in the web page block, the title and URL of the header node of the web page block, and the number of basic unit blocks in the web page block.

[0065] For example, assume that a timer was set for the record in step 206. When this timer expires, the identification information corresponding to the identifier ID1 and the web page URL stored in the record is read from Table 1, which holds the relationship between the user ID, the URL of the web page and the identification information. This information includes the sequence number 12 of the first basic unit block in the web page block, the title "automobile" and the URL of the header node, and the number 2 of basic unit blocks in the web page block.

[0066] In step 2, the corresponding web page is loaded by its URL. A DOM tree is created anew from the code of the web page using an existing document parsing method. The newly created DOM tree is traversed to obtain the sequence numbers of the nodes corresponding to each basic unit block in this tree.

[0067] The structure of the loaded web page may have changed, so the structure of the newly created DOM tree may differ from that of the DOM tree created in step 203. Because the timer is set to a short interval, the structure of the web page usually changes little. The sequence numbers of the nodes corresponding to most basic unit blocks in the DOM tree therefore remain unchanged. Even when the sequence numbers of some nodes do change, the difference between the old and new sequence numbers usually does not exceed 3. For example, suppose that in this step the DOM tree of the web page block with the title "automobile" has the form shown in Figure 5. The header node of the web page block is node C. The web page block contains basic unit blocks 1 and 2, which correspond to nodes 11 and 12. The sequence numbers of nodes 11 and 12 are 11 and 12, respectively.

[0068] In step 3, using the identification information read in step 1, the DOM tree is searched for the nodes corresponding to all basic unit blocks of the web page block, and the URLs of all links in each node are extracted. This involves the following steps (1) to (5).

[0069] (1) According to the sequence number of the first basic unit block in the web page block, read in step 1, the node with that sequence number in the newly created DOM tree is defined as the starting node.

[0070] Compared with step 203, the structure of the web page loaded in step 207 may have changed, and so may the structure of the DOM tree created in step 207. The starting node therefore may or may not be the node corresponding to the first basic unit block of the web page block.

[0071] For example, according to the sequence number 12 of the first basic unit block of the web page block with the title "automobile", the node with sequence number 12 in the DOM tree shown in Figure 5 is defined as the starting node.

[0072] (2) In the newly created DOM tree, the header node is searched for simultaneously in the forward and backward directions, starting from the starting node. When a header node is found, its title and header URL are read.

[0073] For example, in the DOM tree shown in Figure 5, the header node is searched for simultaneously in the forward and backward directions, starting from the starting node with sequence number 12. When the header node is found, the title "automobile" and the header URL are read from it.

[0074] (3) It is determined whether the title and header URL that were read match the title and header URL in the identification information read in step 1. If so, the header node is the header node of the web page block, and step (4) is performed. Otherwise, step (2) is repeated.

[0075] For example, it is determined that the title "automobile" and the URL that were read coincide with the title "automobile" and the URL stored in the record and read in step 1. Step (4) is then performed.
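
The bidirectional search of steps (2) and (3) might be sketched in Python as follows. This is only an illustration under stated assumptions: the DOM tree is flattened into a list whose index is the sequence number, a header node is a dictionary carrying `title` and `url` keys, and the positions of the nodes in the example (9 and 13) as well as the "sports" header are invented for the demonstration.

```python
def find_header_node(nodes, start, title, url):
    """Return the index of the header node matching (title, url), or None.

    nodes: list of dicts in sequence-number order (hypothetical flattened
    DOM); only header nodes are assumed to carry "title" and "url".
    """
    if not nodes:
        return None
    # Keep the starting point inside the list in case the page shrank.
    start = min(max(start, 0), len(nodes) - 1)
    for offset in range(len(nodes)):
        # Look one step further forward and backward on each iteration.
        for idx in (start + offset, start - offset):
            if 0 <= idx < len(nodes):
                node = nodes[idx]
                if node.get("title") == title and node.get("url") == url:
                    return idx
    return None


# Hypothetical page: a non-matching header at 9, the wanted header at 13.
nodes = [{} for _ in range(16)]
nodes[9] = {"title": "sports", "url": "http://sports.qq.com"}
nodes[13] = {"title": "automobile", "url": "http://auto.qq.com"}

header_idx = find_header_node(nodes, 12, "automobile", "http://auto.qq.com")
```

Expanding outward from the starting node fits the observation in the text that sequence numbers usually shift by only a few positions between two loads, so the matching header tends to be found after very few comparisons.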

[0076] (4) In the newly created DOM tree, nodes are searched for consecutively in the backward direction, starting from the header node. The number of nodes to find equals the number of basic unit blocks in the web page block read in step 1.

[0077] In the DOM tree, the nodes corresponding to the basic unit blocks of one web page block and the header node of that block are contiguous. Therefore, once the header node of the web page block has been found, the nodes counted from the header node, as many as there are basic unit blocks in the web page block, are the nodes corresponding to the basic unit blocks of the web page block.

[0078] For example, assume that the number of basic unit blocks in the web page block with the title "automobile" is 2. In the DOM tree shown in Figure 5, two nodes are then searched for consecutively in the backward direction, starting from the header node. Nodes 11 and 12 are found and are used as the nodes corresponding to basic unit blocks 1 and 2 of the web page block.

[0079] (5) From the nodes corresponding to all basic unit blocks of the web page block, the URLs of all links in those nodes are read; these are the URLs of all links included in the web page block.

[0080] For example, assume that the URLs S1, S2, S3, S4, S5, S6, S7, U1, U2, U3, U4, U5 and U6 of all links are extracted from nodes 11 and 12.
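
Steps (4) and (5) could be sketched as below. The sketch assumes the same flattened list of nodes indexed by sequence number, a hypothetical `links` key holding the link URLs of a node, and a header node located at index 13, directly after the basic unit block nodes; none of these representation details are specified in the text.

```python
def collect_block_urls(nodes, header_idx, block_count):
    """Gather the link URLs of the block_count nodes preceding the header.

    nodes: list of dicts in sequence-number order (hypothetical layout);
    each basic unit block node may carry a "links" list of URLs.
    """
    first = header_idx - block_count
    if first < 0:
        raise ValueError("not enough nodes before the header node")
    urls = []
    # The block nodes are contiguous with the header node, so a slice suffices.
    for node in nodes[first:header_idx]:
        urls.extend(node.get("links", []))
    return urls


nodes = [{} for _ in range(16)]
nodes[11] = {"links": ["S1", "S2", "S3", "S4", "S5", "S6", "S7"]}
nodes[12] = {"links": ["U1", "U2", "U3", "U4", "U5", "U6"]}
nodes[13] = {"title": "automobile", "url": "http://auto.qq.com"}

block_urls = collect_block_urls(nodes, header_idx=13, block_count=2)
```

How the 13 URLs are split between nodes 11 and 12 is chosen arbitrarily here; the text only states that together they yield S1 to S7 and U1 to U6.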

[0081] In step 4, the URLs of all links included in the web page block are compared with the URLs of all links stored in the record. If there is any change, step 208 is performed.

[0082] In step 208, the web page corresponding to the changed URL is displayed.

[0083] In particular, when there is any change in the URLs of all links included in the web page block, the URLs of the subscribed web page block stored in the record are updated. A timer can be configured for the record again, in the same way as in step 206. After the timer expires, it is determined again, following the steps above, whether there is any change in the URLs of the subscribed web page block.

[0084] For example, the links S1, S2, S3, S4, S5, S6, S7, U1, U2, U3, U4, U5 and U6 are compared with the links S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11, S12 and S13 stored in the record. The stored links S1 to S13 are replaced with the links S1, S2, S3, S4, S5, S6, S7, U1, U2, U3, U4, U5 and U6, as shown in Table 3. A timer can be configured for the record again.

User ID | URL of the web page | URLs of the subscribed web page block
ID1 | http://www.qq.com | S1, S2, S3, S4, S5, S6, S7, U1, U2, U3, U4, U5 and U6
... | ... | ...
Table 3
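
The comparison and update of step 4 and step 208 can be illustrated with a short Python sketch. The function `diff_urls` and the dictionary layout of the record are assumptions made for the example; the text only requires detecting "any change", which is read here as any URL that appeared or disappeared.

```python
def diff_urls(stored, current):
    """Return (added, removed) URL lists, each in its original order."""
    stored_set, current_set = set(stored), set(current)
    added = [u for u in current if u not in stored_set]
    removed = [u for u in stored if u not in current_set]
    return added, removed


# Record as in Table 2 (hypothetical in-memory form).
record = {
    "user_id": "ID1",
    "page_url": "http://www.qq.com",
    "urls": [f"S{i}" for i in range(1, 14)],
}

# URLs extracted from the newly loaded page in the example.
current = [f"S{i}" for i in range(1, 8)] + [f"U{i}" for i in range(1, 7)]

added, removed = diff_urls(record["urls"], current)
if added or removed:
    # Any change: overwrite the stored URLs, giving the state of Table 3.
    record["urls"] = list(current)
```

In this example the added URLs are U1 to U6, which would be the candidates for display in step 208, and the removed URLs are S8 to S13.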

[0085] Furthermore, in this implementation, the text information of the subscribed web page block is displayed to the user using RSS (Really Simple Syndication) technology. RSS is a way of extracting text from a web document, here a web page, and displaying it.

[0086] In this implementation, the user may subscribe to several web page blocks, and identification information is obtained for each block. The identification information contains at least the sequence number of the first basic unit block in the web page block, the title and URL of the header node of the web page block, and the number of basic unit blocks in the web page block. The identification information of each web page block is stored.

[0087] Because any block on a web page can be identified automatically, the content provider does not need to identify the content of the web page in advance. It is thus possible to subscribe to any content block on a web page, and the amount of service resources that the content provider must supply is reduced.

[0088] Third implementation

[0089] As shown in Fig. 6, another implementation of the present invention provides a method of subscribing to information from a web page. The method can include the following steps.

[0090] In step 301, the user ID and the URL of the web page are received when the user subscribes to the required information from the web page.

[0091] In this implementation, a web page block can likewise be used as the unit of subscription to information from a web page.

[0092] In step 302, the corresponding web page is loaded by its URL, and a DOM tree is created from the web page code using a document parsing method.

[0093] The DOM tree is then traversed to obtain the sequence numbers of all its nodes.
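
As a rough illustration of steps 302 and 303, the sketch below numbers the element nodes of a page using Python's standard `html.parser` module. It relies on the fact that start tags appear in the source in the same order as a pre-order traversal of the element tree. The class name `SequenceNumberer`, the zero-based numbering and the sample HTML are assumptions for the example; the patent does not prescribe a parser or a numbering origin.

```python
from html.parser import HTMLParser


class SequenceNumberer(HTMLParser):
    """Assigns pre-order sequence numbers to element nodes as they open."""

    def __init__(self):
        super().__init__()
        self.nodes = []  # nodes[i] is the element with sequence number i

    def handle_starttag(self, tag, attrs):
        self.nodes.append(
            {"seq": len(self.nodes), "tag": tag, "attrs": dict(attrs)}
        )


def number_nodes(html):
    """Parse the page code and return its elements in sequence-number order."""
    parser = SequenceNumberer()
    parser.feed(html)
    parser.close()
    return parser.nodes


page = (
    "<div><h2><a href='http://auto.qq.com'>automobile</a></h2>"
    "<ul><li>first item</li></ul></div>"
)
numbered = number_nodes(page)
```

A production system would more likely use a full DOM implementation; the event-based parser is used here only because it is available without third-party packages.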

[0094] In step 303, the stored relationships between user ID, web page URL and identification information are searched by the user ID and the URL of the web page. If corresponding identification information is found, step 304 is performed. Otherwise, step 305 is performed.

[0095] If a record containing the user ID and the URL of the web page is found among the stored relationships between user ID, web page URL and identification information, this means that the user has already subscribed to a block of the web page. This implementation makes it possible to display the subscribed web page block. The user can change the web page block to which the user is subscribed.

[0096] In step 304, the subscribed web page block is located on the web page according to the identification information, marked with a special background color and displayed to the user. Step 306 is then performed.

[0097] The identification information contains the sequence number of the first basic unit block in the subscribed web page block, the title and URL of the header node of this web page block, and the number of basic unit blocks in this web page block.

[0098] In particular, in step 1, the DOM tree is searched, according to the identification information, for the nodes corresponding to each basic unit block in the subscribed web page block. This involves the following steps:

[0099] (1) According to the sequence number of the first basic unit block in the subscribed web page block, the node with that sequence number in the DOM tree is defined as the starting node.

[0100] (2) In the DOM tree, the header node is searched for simultaneously in the forward and backward directions, starting from the starting node. When a header node is found, its title and header URL are read.

[0101] (3) It is determined whether the title and header URL that were read match the title and header URL in the identification information. If so, the header node is the header node of the web page block, and step (4) is performed. Otherwise, step (2) is repeated.

[0102] (4) In the DOM tree, a backward search from the header node finds as many nodes as there are basic unit blocks in the subscribed web page block, i.e. the nodes corresponding to all basic unit blocks in this web page block.

[0103] In step 2, the node corresponding to each basic unit block in the subscribed web page block is mapped to the basic unit block on the web page, and the background color of the mapped basic unit blocks is changed to a different color. The web page is then displayed to the user.

[0104] All the mapped basic unit blocks are located in the subscribed web page block. After all basic unit blocks of the subscribed web page block have been displayed on the web page with a special background color, the user can change the subscribed block on the web page, i.e. subscribe again to a web page block.

[0105] In step 305, the loaded web page is displayed to the user.

[0106] The user can select on the web page the information to subscribe to.

[0107] In step 306, the web page block to which the user subscribes is received.

[0108] In step 307, the identification information of the web page block is obtained by identifying the block. The identification information contains at least the sequence number of the first basic unit block in the web page block, the title and URL of the header node of the web page block, and the number of basic unit blocks in the web page block. The user ID, the URL of the web page and the identification information are stored as a record that captures the relationship between these components.

[0109] This step is identical to step 205 of the second implementation and is not described again here.

[0110] In step 308, the URLs of all links included in the subscribed web page block are extracted and stored. The relationship between the user ID, the URL of the web page and the extracted URLs is also stored.

[0111] This step is identical to step 206 of the second implementation and is not described again here.

[0112] In step 309, the URLs of the subscribed web page block are tracked in real time according to the identification information and the stored URLs. If any URL has changed, step 310 is performed.

[0113] This step is identical to step 207 of the second implementation and is not described again here.

[0114] In step 310, the web page corresponding to the changed URL is displayed.

[0115] This step is identical to step 208 of the second implementation and is not described again here.

[0116] Because any block on a web page can be identified automatically, the content provider does not need to identify the content of the web page in advance. It is thus possible to subscribe to any content block on a web page, and the amount of service resources that the content provider must supply is reduced.

Because the subscribed web page block is highlighted on the web page with a special background color, the user experience is improved.

[0117] Fourth implementation

[0118] As shown in Fig. 7, an implementation of the present invention provides a device for subscribing to information from a web page. The device contains the following modules:

an identification module 401 for identifying, when the user subscribes to information from a web page, the web page block to which the user subscribes, using the DOM tree of the web page, to obtain identification information;

a real-time tracking module 402 for extracting and saving the URLs of all links in the web page block to which the user subscribes, and for tracking the URLs in the block according to the identification information and the stored URLs to determine whether any URL has changed;

a display module 403 for displaying the web page corresponding to the changed URL when a URL of the web page block to which the user subscribes has changed.

[0119] The display module 403 contains the following components: an update submodule for updating the stored URLs according to the changed URL; and a display submodule for displaying the text information of the web page block to which the user subscribes.

[0120] The device may also contain a pre-creation module for creating the DOM tree of the web page.

[0121] The identification module 401 may contain the following components: a first receiving unit for obtaining from the DOM tree of the web page the sequence number of the first basic unit block in the web page block to which the user subscribes, and the number of basic unit blocks included in the web page block;

a second receiving unit for obtaining the URL prefix of the web page block to which the user subscribes;

a first search unit for searching the DOM tree of the web page, by the URL prefix, for the header node of the web page block to which the user subscribes, and extracting the title and header URL of this node.

[0122] The identification information consists of the sequence number of the first basic unit block in the web page block to which the user subscribes, the number of basic unit blocks in this web page block, and the title and URL of the header node of the web page block to which the user subscribes.

[0123] The first receiving unit may contain the following components: a traversal sub-unit for traversing the DOM tree of the web page and, on passing through a node corresponding to a basic unit block, reading the sequence number of this node as the sequence number of the basic unit block;

a selection sub-unit for selecting the minimum sequence number among the basic unit blocks as the sequence number of the first basic unit block in the web page block;

a first determining sub-unit for determining the number of basic unit blocks included in the web page block to which the user subscribes.

[0124] The second receiving unit may contain the following components:

a second determining sub-unit for extracting the URL prefixes of all links in the web page block to which the user subscribes, counting the URL prefixes of each type, and selecting as the URL prefix of the web page block to which the user subscribes the prefix type with the largest count.
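
The majority vote over URL prefixes might look as follows in Python. This passage does not define what counts as a URL prefix, so the sketch assumes it is the scheme plus host of each link; the function names and the sample links are likewise illustrative.

```python
from collections import Counter
from urllib.parse import urlsplit


def url_prefix(url):
    """Assumed definition of a URL prefix: scheme and host of the link."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def block_url_prefix(link_urls):
    """Return the most frequent prefix among the links, or None if empty."""
    counts = Counter(url_prefix(u) for u in link_urls)
    if not counts:
        return None
    prefix, _ = counts.most_common(1)[0]
    return prefix


# Hypothetical links of one web page block.
links = [
    "http://auto.qq.com/a/1.htm",
    "http://auto.qq.com/a/2.htm",
    "http://news.qq.com/b/3.htm",
    "http://auto.qq.com/a/4.htm",
]
prefix = block_url_prefix(links)
```

Picking the most frequent prefix tolerates a few links in the block that point elsewhere, such as the single news link in this sample.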

[0125] The first search unit may contain the following components:

a first search sub-unit for searching for header nodes in the DOM tree of the web page in the forward direction, starting from the node corresponding to the first basic unit block;

a second search sub-unit for finding, among the header nodes found, the header node whose URL prefix is identical or similar to the obtained URL prefix, using the found node as the header node of the web page block, and extracting its title and header URL.

[0126] The real-time tracking module 402 may contain the following components:

a reading unit for reading the identification information and the stored URLs;

a creation unit for creating the DOM tree of the web page;

a determining unit for determining the starting node in the DOM tree according to the sequence number of the first basic unit block in the web page block to which the user subscribes;

a second search unit for searching the DOM tree for the nodes corresponding to the basic unit blocks in the web page block to which the user subscribes, using the determined starting node, the title and URL of the header node, and the number of basic unit blocks in the web page block;

a comparison unit for comparing the URLs in the nodes corresponding to each basic unit block in the web page block with the stored URLs.

[0127] The second search unit may contain the following components:

a third search sub-unit for searching for the header node by its title and header URL in the DOM tree simultaneously forward and backward from the starting node;

a fourth search sub-unit for consecutively searching the DOM tree, starting from the header node, for nodes whose number equals the number of basic unit blocks in the web page block, the nodes found being the nodes corresponding to the basic unit blocks in the web page block.

[0128] As shown in the corresponding figure, the device may also contain a determining module 404 for determining whether the web page contains a block to which the user is subscribed and for displaying this block on the web page highlighted with a special background color.

[0129] In the implementations of the present invention, any block on a web page can be identified automatically. The content provider therefore does not need to identify the content of the web page in advance. It is thus possible to subscribe to any content block on a web page, and the amount of service resources that the content provider must supply may be reduced.

[0130] All or part of the technical solutions provided in the implementations of the present invention and described above may be implemented by a program stored in a machine-readable storage medium, such as a hard disk, CD-ROM or floppy disk.

[0131] What has been described and illustrated here is an example of the invention along with some of its variants. The terms, descriptions and figures used here are for illustration only and are not meant as limitations. Many variations are possible within the spirit and scope of the present invention, which is defined by the following claims (and their equivalents), in which all terms are used in their broadest reasonable sense unless otherwise indicated.

1. A method of subscribing to information from a web page, comprising the following steps:
identifying a web page block to which the user subscribes, using a first Document Object Model (DOM) tree of the web page, to obtain identification information;
extracting the URLs of all links in the web page block to which the user subscribes, and tracking in real time the URLs in the block according to the identification information and the stored URLs to determine whether any of the stored URLs has changed;
displaying the web page corresponding to the changed URL if any URL of the web page block to which the user subscribes has changed.

2. The method according to claim 1, wherein displaying the web page corresponding to the changed URL comprises the following steps:
updating the stored URLs according to the changed URL;
displaying the text information of the web page block to which the user subscribes.

3. The method according to claim 1, further comprising the following step:
before identifying the web page block to which the user subscribes, using the first DOM tree of the web page, to obtain identification information, creating the first DOM tree of the web page.

4. The method according to claim 1, characterized in that identifying the web page block to which the user subscribes, using the first DOM tree of the web page, to obtain identification information comprises the following steps:
obtaining from the first DOM tree of the web page the sequence number of the first basic unit block in the web page block to which the user subscribes, and the number of basic unit blocks included in the web page block;
obtaining the URL prefix of the web page block to which the user subscribes;
searching the first DOM tree of the web page, by the URL prefix, for the header node of the web page block to which the user subscribes, and extracting the title and header URL of this node;
wherein the identification information includes the sequence number of the first basic unit block in the web page block to which the user subscribes, the number of basic unit blocks included in the web page block, and the title and URL of the header node.

5. The method according to claim 4, characterized in that a node corresponding to a basic unit block does not contain any other node, and the number of characters in the basic unit block exceeds a predetermined threshold value.

6. The method according to claim 5, characterized in that the threshold value is 20.

7. The method according to claim 4, characterized in that obtaining from the first DOM tree of the web page the sequence number of the first basic unit block in the web page block to which the user subscribes comprises the following steps:
pre-order traversing the first DOM tree of the web page and, on passing through a node corresponding to a basic unit block in the web page block to which the user subscribes, reading the sequence number of this node as the sequence number of the basic unit block;
selecting the minimum sequence number among the basic unit blocks in the web page block to which the user subscribes as the sequence number of the first basic unit block in the web page block to which the user subscribes.

8. The method according to claim 4, characterized in that obtaining the number of basic unit blocks included in the web page block to which the user subscribes comprises the following step:
pre-order traversing the first DOM tree of the web page and determining the number of basic unit blocks included in the web page block to which the user subscribes.

9. The method according to claim 4, characterized in that obtaining the URL prefix of the web page block to which the user subscribes comprises the following step:
extracting the URL prefixes of all links in the web page block to which the user subscribes, counting the URL prefixes of each type, and selecting as the URL prefix of the web page block to which the user subscribes the prefix type with the largest count.

10. The method according to claim 4, characterized in that searching the first DOM tree of the web page for the header node of the web page block to which the user subscribes comprises the following steps:
searching for candidate header nodes in the first DOM tree of the web page in the forward direction from the node corresponding to the first basic unit block in the web page block to which the user subscribes;
searching among the candidate header nodes for the header node whose URL is identical or similar to the URL prefix, and defining the node found as the header node of the web page block to which the user subscribes.

11. The method according to claim 4, characterized in that tracking the URLs in the web page block to which the user subscribes according to the identification information and the stored URLs to determine whether any URL has changed comprises the following steps:
reading the identification information and the stored URLs;
creating a second DOM tree of the web page;
determining the starting node in the second DOM tree according to the sequence number of the first basic unit block in the web page block to which the user subscribes;
searching the second DOM tree for the nodes corresponding to the basic unit blocks in the web page block to which the user subscribes, using the starting node, the title and URL of the header node, and the number of basic unit blocks in the web page block to which the user subscribes;
comparing the URLs in the nodes corresponding to the basic unit blocks in the web page block to which the user subscribes with the stored URLs.

12. The method according to claim 11, characterized in that searching the second DOM tree for the nodes corresponding to the basic unit blocks in the web page block to which the user subscribes, using the starting node, the title and URL of the header node, and the number of basic unit blocks in the web page block to which the user subscribes, comprises the following steps:
searching for the header node by its title and header URL in the second DOM tree simultaneously forward and backward from the starting node;
searching the second DOM tree in the backward direction from the header node for nodes whose number equals the number of basic unit blocks in the web page block to which the user subscribes, the nodes found being the nodes corresponding to the basic unit blocks in the web page block to which the user subscribes.

13. The method according to claim 1, further comprising the following step:
before identifying the web page block to which the user subscribes, using the first DOM tree of the web page, to obtain identification information, determining whether the web page contains a block to which the user is already subscribed; if such a block exists, displaying it on the web page highlighted with a special background color.

14. A device for subscribing to information from a web page, comprising the following modules:
an identification module for identifying a web page block to which the user subscribes, using a first Document Object Model (DOM) tree of the web page, to obtain identification information;
a real-time tracking module for extracting and saving the URLs of all links in the web page block to which the user subscribes, and for tracking the URLs in the block according to the identification information and the stored URLs to determine whether any URL has changed;
a display module for displaying the web page corresponding to the changed URL if any URL of the web page block to which the user subscribes has changed.

15. The device according to claim 14, wherein the display module contains the following components:
an update submodule for updating the stored URLs according to the changed URL;
a display submodule for displaying the text information of the web page block to which the user subscribes.

16. The device according to claim 14, further comprising:
a pre-creation module for creating the first DOM tree of the web page.

17. The device according to claim 14, wherein the identification module includes the following components:
a first receiving unit for obtaining from the first DOM tree of the web page the sequence number of the first basic unit block in the web page block to which the user subscribes, and the number of basic unit blocks in this web page block;
a second receiving unit for obtaining the URL prefix of the web page block to which the user subscribes;
a first search unit for searching the first DOM tree of the web page, by the URL prefix, for the header node of the web page block to which the user subscribes, and extracting the title and header URL of this node;
wherein the identification information includes the sequence number of the first basic unit block in the web page block to which the user subscribes, the number of basic unit blocks in this web page block, and the title and URL of the header node.

18. The device according to claim 17, wherein the first receiving unit contains the following components:
a traversal sub-unit for pre-order traversing the first DOM tree of the web page and, on passing through a node corresponding to a basic unit block of the web page block, reading the sequence number of this node as the sequence number of the basic unit block;
a selection sub-unit for selecting the minimum sequence number among the basic unit blocks in the web page block to which the user subscribes as the sequence number of the first basic unit block in the web page block to which the user subscribes;
a first determining sub-unit for determining the number of basic unit blocks in the web page block to which the user subscribes.

19. The device according to claim 17, characterized in that the second receiving unit includes:
a second determining sub-unit for extracting the URL prefixes of all links in the web page block to which the user subscribes, counting the URL prefixes of each type, and selecting as the URL prefix of the web page block to which the user subscribes the prefix type with the largest count.

20. The device according to claim 17, characterized in that the first search unit contains the following components:
a first search sub-unit for searching for candidate header nodes in the first DOM tree of the web page in the forward direction from the node corresponding to the first basic unit block in the web page block to which the user subscribes;
a second search sub-unit for finding, among the candidate header nodes, the header node whose URL is identical or similar to the URL prefix, using it as the header node of the web page block to which the user subscribes, and extracting the title and header URL of this node.

21. The device according to claim 14, wherein the real-time tracking module includes the following components:
a reading unit for reading the identification information and the stored URLs;
a creation unit for creating a second DOM tree of the web page;
a determining unit for determining the starting node in the second DOM tree according to the sequence number of the first basic unit block in the web page block to which the user subscribes;
a second search unit for searching the second DOM tree for the nodes corresponding to the basic unit blocks in the web page block to which the user subscribes, using the starting node, the title and URL of the header node, and the number of basic unit blocks in the web page block to which the user subscribes;
a comparison unit for comparing the URLs in the nodes corresponding to the basic unit blocks with the stored URLs.

22. The device according to claim 21, wherein the second search unit contains the following components:
a third search sub-unit for searching for the header node by its title and header URL in the second DOM tree simultaneously forward and backward from the starting node;
a fourth search sub-unit for searching the second DOM tree in the backward direction from the header node for nodes whose number equals the number of basic unit blocks in the web page block to which the user subscribes, the nodes found being the nodes corresponding to the basic unit blocks in the web page block to which the user subscribes.

23. The device according to claim 14, further comprising:
a determining module for determining whether the web page contains a block to which the user is subscribed and for displaying this block on the web page highlighted with a special background color.

 
