0:01 drewr: Golly, if I want to get the mtime of a file, Google tells me I need to use the Tomcat FileInfo class. Is there a better way?
0:02 Surely there's something in java.io.*.
0:06 abrooks: drewr: Java boldly refuses to acknowledge that there is an underlying platform. What you're looking for may be there but I suspect not.
0:08 jonathan__: hmmm, see this --> http://
0:11 but it looks like java.io.File can get you the last modified --> http://
0:11 abrooks: posix.* is not part of the Java distribution from anyone. :(
0:19 drewr: Heh, lastModified()... ugh.
0:19 jonathan__: === mtime ?
0:20 drewr: What about the other things you might need to know? inode, symlink, etc.?
0:22 abrooks: Java is its own platform. It's not a good platform for building system tools without third-party classes (JNI based).
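
A minimal sketch of the lastModified() route mentioned above, plus one possible answer to the symlink and inode question. The second half assumes Java 7 or later, where the java.nio.file API was added, and the "unix:ino" attribute is only available on POSIX-style file systems. The class name FileStat and its method names are hypothetical, chosen for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper class; not part of any library.
final class FileStat {

    // mtime in milliseconds since the epoch, using only java.io.
    // File.lastModified() returns 0L when the file does not exist.
    static long mtimeMillis(File f) {
        return f.lastModified();
    }

    // Java 7+: read basic attributes without following a symlink in the
    // final path component, roughly comparable to lstat().
    static BasicFileAttributes lstat(Path p) {
        try {
            return Files.readAttributes(
                    p, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Java 7+, POSIX-style file systems only: the inode number.
    // Throws UnsupportedOperationException where the "unix" view is missing.
    static Object inode(Path p) {
        try {
            return Files.getAttribute(p, "unix:ino", LinkOption.NOFOLLOW_LINKS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this, FileStat.lstat(path).isSymbolicLink() and FileStat.lstat(path).lastModifiedTime() cover the symlink and mtime cases without JNI-based third-party classes.
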
0:23 jonathan__: yeah, I work in "enterprise" software, and we'd typically never need stuff like that ... sadly we use C++, if only we could use Java
0:23 abrooks: There are lots of RFIs for platform support.
0:24 drewr: This philosophy never made sense to me. So many problems have been solved by operating systems that you shouldn't have to re-solve. :-)
0:24 jonathan__: RFI?
0:24 abrooks: The GNU Classpath project is extending some base classes. It would be nice if they'd support posix-y-gnu-ish interfaces.
0:25 jonathan__: RFE, sorry. Request For Enhancement.
0:29 drewr: I'm looking at Clojure for migrating some data concurrently between SQL Server and Postgres with JDBC. *That* should be well-supported.
0:29 abrooks: That would be Java's domain. :)
0:29 jonathan__: Ok, I don't know about pg, but the jtds 1.2 driver works like a champ with sql server
0:30 and the pure java Oracle thin drivers rock also
0:30 drewr: jonathan__: Awesome, thanks.
0:31 I've used the thin driver for ORA before.
0:31 It did work well.
0:31 jonathan__: I tried and tried but *strangely*, the MS driver for SQL Server completely failed to connect
0:31 * drewr researches pg options
0:31 jonathan__: </sarcasm>
0:34 Wonder what the best way of approaching this would be. Have agents bite off a chunk of rows and each work independently?
0:34 jonathan__: What are you trying to do?
0:35 drewr: We've got massive amounts of data that comes off our telecom platform, which only talks SQL Server.
0:36 In order to manipulate it and report on it, we bring it over to PG.
0:36 The process for doing that is extremely slow.
0:37 I think that doing it concurrently will speed things up.
0:39 jonathan__: What's the fastest that pg will slurp in data? Can you generate a bulk insert file? Or are you using other methods?
0:39 (assuming pg supports stuff like that)
0:40 drewr: I've only tried DTS with SQL Server so far.
0:40 It's dog-slow.
0:40 Literally days to get a single dump.
0:41 That's why I'm going to write something that's more efficient, but if I do it sequentially I'm afraid I'll have the same problem.
0:41 ...doing 100 or 1000 rows at a time.
0:44 jonathan__: so you use DTS to generate data to a text file?
0:44 drewr: So my naïve idea is to have a pointer to the current row that gets updated in a Clojure transaction every time an agent grabs his dataset.
0:44 jonathan__: No, it moves it straight into PG.
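
The "pointer to the current row" idea can be sketched roughly as follows. This is plain Java with an AtomicLong standing in for the Clojure transaction and a thread pool standing in for agents, so it is a substitute technique, not the design discussed here. It also assumes rows can be addressed by a stable offset (for example, an ordered key). ChunkClaimer, claim and runWorkers are hypothetical names, and the actual copy step is left as a comment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: workers atomically claim non-overlapping row ranges.
final class ChunkClaimer {
    private final AtomicLong next = new AtomicLong(0); // shared "current row" pointer
    private final long totalRows;
    private final int chunkSize;

    ChunkClaimer(long totalRows, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.totalRows = totalRows;
        this.chunkSize = chunkSize;
    }

    // Returns {start, endExclusive}, or null once every row has been claimed.
    long[] claim() {
        long start = next.getAndAdd(chunkSize); // atomic: no two callers share a start
        if (start >= totalRows) {
            return null;
        }
        return new long[] { start, Math.min(start + chunkSize, totalRows) };
    }

    // Runs `workers` threads until all chunks are claimed and returns the
    // number of rows they handled in total.
    static long runWorkers(long totalRows, int chunkSize, int workers) {
        ChunkClaimer claimer = new ChunkClaimer(totalRows, chunkSize);
        AtomicLong handled = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                long[] range;
                while ((range = claimer.claim()) != null) {
                    // A real worker would read rows [range[0], range[1]) from
                    // the source and write them to the target over JDBC here.
                    handled.addAndGet(range[1] - range[0]);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return handled.get();
    }
}
```

Whether this helps depends on where the bottleneck is. If per-row round trips dominate, more workers may only move the contention to the database.
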
0:50 jonathan__: Sounds like the overhead of using DTS/ODBC(?) may be the problem, rather than being sequential ... but obviously I could be totally wrong
0:52 drewr: True, it could be. I need to profile it better to see where the bottleneck is.
0:56 jonathan__: Assuming round-tripping is the problem, I'd be looking to try and generate something that could be read by the copy command ... http://
0:59 Hopefully SQL server should be able to spit out CSV files at 10s of k rows a sec
1:00 versus 200 rows a sec which sounds like what you may be seeing
1:00 drewr: That's probably the ballpark
1:00 I don't really want to generate intermediate data, but I may have to.
1:02 jonathan__: yeah, escaping text data can be a pain etc ...
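
For the escaping problem, a minimal sketch of quoting fields for a CSV file, assuming the target is PostgreSQL's COPY in CSV mode with default settings (comma delimiter, double-quote quoting, unquoted empty string read as NULL). CsvRow, escapeField and toLine are hypothetical names, and corner cases such as a line consisting only of the end-of-data marker are not handled.

```java
// Hypothetical sketch of CSV field escaping for a bulk-load file.
final class CsvRow {

    // Quotes a field when it contains a delimiter, quote, or line break.
    // null becomes an unquoted empty field (NULL under the assumed defaults);
    // an empty string becomes "" so it stays distinct from NULL.
    static String escapeField(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.isEmpty()
                || value.indexOf(',') >= 0
                || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value;
        }
        // Embedded quotes are doubled, then the whole field is wrapped.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Joins escaped fields into one CSV line (without a trailing newline).
    static String toLine(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escapeField(fields[i]));
        }
        return sb.toString();
    }
}
```
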
1:03 which reminds me, does emit escape data yet ... *my* version does :)
1:17 Let me know if you see any data errors. The format obviously needs improvement.
1:17 drewr: Chouser: Cool, thanks.
1:17 I'm off to bed. Thanks for the brainstorming guys.
1:18 jonathan__: cool, should there be a notice that the channel is archived? or is that pretty common for irc?
1:19 Chouser: jonathan__: I dunno. To suggest that anything said here is private seems a bit of a stretch though.
1:20 it's not automatically updated yet. Hopefully I can add that tomorrow.
1:20 rhickey already mentioned he liked the idea. I guess if people have objections I can take the pages back down.
1:21 Past my bedtime. Later!
1:21 jonathan__: cheers
20:15 that's the IRC log for the last couple months.
20:15 rhickey: cool
20:15 Chouser: I think it'd be most useful if we can get Google to index it.
20:16 rhickey: are you interested in hosting it at clojure.org, or should I let google have at it on my own domain?
20:17 rhickey: clojure.org maps to sf right now
20:17 Chouser: ok, that's fine. It's just html and js files, no cgis or servlets or anything.
20:17 or n01se.net/clojure_log is fine with me too, just thought I'd ask.
20:17 rhickey: I'd have to get some automated way to upload it regularly
20:18 Chouser: yeah, rsync over ssh would be preferred (that's how I'm getting it onto n01se), but ftp or whatever is fine too.
20:20 rhickey: Let me think about it - still catching up, was away this weekend
20:25 Chouser: np
20:25 and no rush either