Gaming How to mass edit a list of URLs?

Mythical

I've been trying to download a bunch of card images from a site before it goes down for good.
I have a list of the card image URLs in CSV format and can use those links to mass download the images.
By default I couldn't get the full-res images into the list.
The only difference between the low-res and enlarged versions' URLs is an extra /big/ between the ID numbers (the numbered folder and the file name).
Thing is, those ID numbers aren't consistent from one URL to the next.
I've tried using Notepad++ to do this, but it was really inconsistent given how the URLs vary, and it hardly saves any time (I'd also be worried I'd miss some, and I'm very limited on time).
Here's an example list of the low-res URLs I want to change to high-res URLs:
"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/FC-9.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/HOL-2.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/FC-26.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/FC-10.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/FC-14.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/FC-27.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/PR6-2.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/OP5-4.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/OP6-1.jpg"
"
And here's a before and after example that's consistent throughout the list (you may click on them, but it seems the larger versions won't load unless you've first loaded the smaller versions from a link list or from the site).
This hasn't harmed my ability to download the high-res images in bulk, which I can do once I've edited this long URL list.
http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/OP6-1.jpg
http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/big/OP6-1.jpg

If anyone could help me with some advice or a program recommendation that would cut down the time spent editing this list (it's much longer than the example list), that'd be great :D Otherwise I'm surprised you got this far, but have a chill day.

If anyone is wondering, I'm trying to preserve some older card games and bring them to Tabletop Simulator before they disappear into the internet abyss.
 

Enkuler

I'm not sure I understood everything, but I tried this in Notepad++.
Edit: sorry for the French Notepad++; that's the Replace dialog.

[Screenshot of the Notepad++ Replace dialog: 2019-06-30 08_12_00-Remplacer.png]


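Pasting the two fields here so you don't have to fill them in manually: the first line goes in "Find what" and the second in "Replace with", with Search Mode set to "Regular expression".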
Code:
(http://.*/)([^/]*\.jpg)
\1big/\2
It transformed your list into this:
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/big/FC-9.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/big/HOL-2.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/big/FC-26.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/big/FC-10.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/big/FC-14.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/big/FC-27.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/big/PR6-2.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/big/OP5-4.jpg"
"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/big/OP6-1.jpg"
 
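For anyone who would rather script this than use the Replace dialog, the same find/replace pair can be sketched in Python. This is only a sketch: the pattern and replacement are copied from the two fields in the post above, while the function name add_big is made up for illustration.

```python
import re

# Same pattern as the "Find what" field: group 1 is everything up to the
# last slash, group 2 is the .jpg file name.
PATTERN = re.compile(r'(http://.*/)([^/]*\.jpg)')


def add_big(line: str) -> str:
    """Insert big/ before the file name, mirroring the \\1big/\\2 replacement."""
    return PATTERN.sub(r'\1big/\2', line)


if __name__ == "__main__":
    sample = '"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/876/OP6-1.jpg"'
    print(add_big(sample))
```

Like the Notepad++ version, this is not idempotent: running it a second time over an already converted line would insert big/ again, so it should be applied once to the original low-res list.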

Mythical

That's perfect, and I don't mind the French at all (: I actually know a little (I knew more once upon a time)!
Thank you for the help and timely reply :D. I didn't know they had such a notation until now.
With this I should be able to scrape together quite a few card games.
 

CMDreamer

Using Notepad++, you can mass edit any text-based file using regular expressions (RegExp), which are a very powerful text-editing tool.

My advice is to first clean the text file of all the text/HTML tags that are useless for your purpose, and then apply a RegExp to "format" the URLs as needed.

From your example, I'd use tokens (capture groups) to split each original URL (the non-"big" ones) into parts, and then, using the same tokens, concatenate them with "big" to form the final URL. By using patterns, everything's really easy.
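The token idea can also be tried without a regular expression, by splitting each URL at its last slash. A minimal Python sketch, assuming every low-res URL ends in a .jpg file name as in the example list; to_big_url and convert_line are hypothetical helper names, not part of any tool mentioned in this thread:

```python
def to_big_url(url: str) -> str:
    """Split a URL at its last slash and rejoin the two parts around 'big'."""
    folder, sep, filename = url.rpartition("/")
    if not sep or not filename.lower().endswith(".jpg"):
        return url  # not an image URL of the expected shape; leave untouched
    if folder.endswith("/big"):
        return url  # already a high-res URL; do not insert big/ twice
    return f"{folder}/big/{filename}"


def convert_line(line: str) -> str:
    """Convert one line of the list, keeping surrounding double quotes if any."""
    stripped = line.strip()
    if len(stripped) >= 2 and stripped[0] == '"' and stripped[-1] == '"':
        return '"' + to_big_url(stripped[1:-1]) + '"'
    return to_big_url(stripped)


if __name__ == "__main__":
    print(convert_line('"http://www.tradecardsonline.com/img/cards/fullmetal-alchemist/877/FC-9.jpg"'))
```

Because already converted URLs are skipped, this version can be run repeatedly over a partly edited list without doubling the big/ segment.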

Edit:
Another "easy" way is to use a mass image downloader; these automatically fetch the image URLs and download them.
 

Cyan

For mass downloading I use the DownThemAll browser extension: right-click a page and you're done!
It has filter support and automatic file renaming if needed.
It automatically selects links to download based on the file extension or the file name, or all the URLs if you want.

For Notepad++, the default regexp is not that powerful and misses a lot of features such as new line, one per line, text wrap, etc.
I use the additional N++ plugin RegRexPlace, which is very good but might be a little hard to learn. Once installed, press CTRL+R to open the RegRexPlace menu.
 

kuwanger

Cyan said: "for mass download I use DownThemAll browser extension."

There's also "wget -i url_list_file". I don't know the syntax of curl, but it's pretty similar.
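For comparison, here is a rough Python equivalent of downloading everything in a list file, using only the standard library. It is a sketch under several assumptions: the list holds one URL per line, possibly wrapped in double quotes as in the example earlier in the thread; read_urls, local_name and download_all are hypothetical names; and prefixing the saved file with its folder number is just one guess at how to avoid name clashes between folders.

```python
import os
import urllib.request
from urllib.parse import urlsplit


def read_urls(path):
    """Return the non-empty URLs from a list file, without surrounding quotes."""
    urls = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            url = line.strip().strip('"')
            if url:
                urls.append(url)
    return urls


def local_name(url):
    """Build a file name from the folder ID and file name, e.g. 876_OP6-1.jpg."""
    parts = [p for p in urlsplit(url).path.split("/") if p and p != "big"]
    return "_".join(parts[-2:])


def download_all(urls, dest_dir, fetch=None):
    """Save each URL into dest_dir; return a list of (url, error) failures."""
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as response:
                return response.read()
    os.makedirs(dest_dir, exist_ok=True)
    failures = []
    for url in urls:
        try:
            data = fetch(url)
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            failures.append((url, exc))
            continue
        with open(os.path.join(dest_dir, local_name(url)), "wb") as out:
            out.write(data)
    return failures
```

A typical call would be download_all(read_urls("url_list_file"), "cards"). Failed URLs are collected rather than aborting the run, so they can be retried later if the site is slow or intermittently down.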

Cyan said: "For notepad++, the default regexp is not that powerful and miss lot of features such as new line, one per line, text wrap, etc."

To be fair, very few regex programs deal with new lines. The general trick in Linux is to do something like tr \\n \\v | ... | tr \\v \\n, turning new lines into vertical tabs and back again with the regex in between. GNU regex has its own set of extensions, as does Perl. Personally I try to avoid anything too fancy because I'd rather not learn something that doesn't transfer over trivially.
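The newline-to-vertical-tab trick can be illustrated in Python as well. This is only a sketch of the idea, with a made-up helper name sub_across_lines; in Python specifically the re.DOTALL flag makes `.` match newlines directly, so the swap is shown here purely to demonstrate how the tr pipeline works.

```python
import re


def sub_across_lines(pattern, repl, text):
    """Apply a regex while newlines are temporarily replaced by vertical tabs."""
    if "\v" in text:
        # The round trip would turn pre-existing vertical tabs into newlines.
        raise ValueError("text already contains a vertical tab")
    swapped = text.replace("\n", "\v")
    return re.sub(pattern, repl, swapped).replace("\v", "\n")


if __name__ == "__main__":
    text = "keep\nBEGIN\njunk\nEND\nkeep"
    # Without the swap, `.` stops at each newline and nothing is replaced.
    print(repr(re.sub(r"BEGIN.*?END", "", text)))
    # With the swap, the pattern can span the three lines.
    print(repr(sub_across_lines(r"BEGIN.*?END", "", text)))
```

The same caveat applies to the shell version: the input must not already contain vertical tabs, or they will come back as newlines.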
 

Mythical

"How are you downloading these? It may be worth investing in writing a script in Python or Node to go through and save the contents of each link rather than doing it by hand."
I'm using JDownloader 2 to download the images from the finished links. Problem is, I woke up this morning to find the site seemingly down a day early :/
I've been trying to use Octoparse to collect all the high-res image URLs for each expansion from http://www.tradecardsonline.com/im/selectCard/game_id/56/goal/
I understand that coding a web scraper is a great idea; I was just limited on time :D
Edit: it seems to either load really, really slowly or give me some form of backup with no accessibility. I hope to get all the cards before it starts acting too unpredictably lol
Edit 2: it seems like there's nothing to be done at this point to save them (unless I'm missing something). RIP, and thanks, all, for the helpful answers!
It's a sad day for trading card games.
 