Example 2 - Wired Magazine.
Posted: 27 August 2001 by Spidersoft Team, Melbourne, Australia
Aim:
Capture the August 2001 issue of Wired magazine.
Content organization:
Each monthly issue is organized within its own sub-directory.
For example:
http://www.wired.com/wired/archive/9.07/
http://www.wired.com/wired/archive/9.08/
Method:
Since each issue is contained within its own sub-directory, we set the page location filter to "Within current directory". This keeps WebZIP from following page links into other issues.
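To illustrate what a "Within current directory" rule amounts to, here is a minimal Python sketch. It is not WebZIP's own code: the function names are hypothetical, and we assume the rule means "same host, and the link's path starts with the directory of the start URL". The page name `article.html` is made up for the example.

```python
from urllib.parse import urlsplit
import posixpath


def directory_of(url: str) -> str:
    """Return the directory part of a URL path, always ending in '/'."""
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        return path
    return posixpath.dirname(path).rstrip("/") + "/"


def within_current_directory(start_url: str, link: str) -> bool:
    """Assumed rule: same host, and the link lives under the start directory."""
    start, target = urlsplit(start_url), urlsplit(link)
    if start.netloc.lower() != target.netloc.lower():
        return False
    return (target.path or "/").startswith(directory_of(start_url))


start = "http://www.wired.com/wired/archive/9.08/"
# Hypothetical article page inside the August issue: followed.
print(within_current_directory(start, start + "article.html"))
# The July issue sits in a sibling directory: not followed.
print(within_current_directory(start, "http://www.wired.com/wired/archive/9.07/"))
```

Under these assumptions the first check passes and the second fails, which is the behaviour we want for capturing a single issue.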
Each magazine article also has a printer-friendly version. We can prevent WebZIP from downloading these redundant pages by using a URL filter.
If you hover over each "print" link, you'll find that every such URL contains "_pr" as a common term.
So we add the following entry to our URL Exclude Filters:
[PL]_pr
This simply means 'exclude all page links containing "_pr" in their URL'.
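The effect of this filter can be sketched in Python as a plain substring test. This is only an illustration of the rule as described above, not WebZIP's implementation: the function name is hypothetical, only the `[PL]` prefix is handled, matching is assumed to be case-sensitive, and the file names in the URLs are invented.

```python
from typing import Iterable

PAGE_LINK_PREFIX = "[PL]"  # prefix used in the example filter above


def is_excluded(url: str, is_page_link: bool, filters: Iterable[str]) -> bool:
    """Return True if a page link's URL contains the term of any [PL] filter."""
    for entry in filters:
        if not entry.startswith(PAGE_LINK_PREFIX):
            continue  # other filter kinds are outside this sketch
        term = entry[len(PAGE_LINK_PREFIX):]
        if is_page_link and term and term in url:
            return True
    return False


exclude_filters = ["[PL]_pr"]
base = "http://www.wired.com/wired/archive/9.08/"
# Hypothetical printer-format page: excluded.
print(is_excluded(base + "article_pr.html", True, exclude_filters))
# Hypothetical normal article page: kept.
print(is_excluded(base + "article.html", True, exclude_filters))
```

Note that a substring test would also exclude any ordinary page whose URL happens to contain "_pr", so it is worth checking that the term is specific enough for the site being captured.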
Task Summary:
| Task Name: | wired_mag_Aug2001 |
| Task Folder: | E-Zines |
| Start URL(s): | http://www.wired.com/wired/archive/9.08/ |
| Save to folder: | D:\My Intranet\wired_mag_Aug2001\ |
| Profile: | |
| Filetypes: | All |
| Followed links - Levels: | All levels |
| Followed page links - Location: | Within current directory |
| Followed media links - Location: | Within current site |
| Include Filters: | |
| Exclude Filters: | [PL]_pr |
| Link Conversion - Followed Links: | Convert ALL followed links to relative links |
| Link Conversion - Unfollowed Links: | Convert unfollowed links to absolute links |
| Schedule: | Don't schedule this task |
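As a rough illustration of the "Convert ALL followed links to relative links" setting, the Python sketch below rewrites an absolute link as a path relative to the page that contains it. This is an assumption about what such a conversion involves, not a description of how WebZIP does it; the function name and the file names are hypothetical, and both URLs are assumed to be on the same host.

```python
import posixpath
from urllib.parse import urlsplit


def to_relative_link(page_url: str, target_url: str) -> str:
    """Express target_url relative to the directory that holds page_url."""
    page_dir = posixpath.dirname(urlsplit(page_url).path) or "/"
    target_path = urlsplit(target_url).path or "/"
    return posixpath.relpath(target_path, start=page_dir)


base = "http://www.wired.com/wired/archive/9.08/"
# A sibling page in the same directory becomes a bare file name.
print(to_relative_link(base + "index.html", base + "article.html"))
# A file in a sub-directory keeps only the sub-directory part.
print(to_relative_link(base, base + "images/logo.gif"))
```

Relative links of this kind keep working when the saved folder is moved or browsed offline, which is why they suit the pages that were actually downloaded.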
Task File:
wired_mag_Aug2001.wzt